EXECUTIVE SUMMARY


PROBLEM 

The aviation training research examining the effectiveness of simulator
training has been so diverse that the results of the individual investigations
have been difficult to combine. Traditional narrative reviews have produced
inconsistent conclusions. This has resulted in an inability to derive
specific guidance for training design from the accumulated research.

OBJECTIVE 

The objective of this review was to apply recent advances in data
integration to the aviation simulator training effectiveness research, in
order to identify those characteristics that have an impact on training
outcomes. Areas of interest included: 1) input variables (task equipment,
task requirements, and trainee characteristics); and 2) throughput variables
(simulator design and training context).

APPROACH 

A total of 247 journal articles and technical reports that addressed
aviation training were located. From this base, experiments which included
training transfer to the actual equipment were selected. Quantitative
review techniques (collectively referred to as meta-analysis) were applied to
those experiments that reported the information required for the statistical
analysis. A total of 26 experiments (19 involving jet aircraft and seven
involving helicopters) were included in the final meta-analysis.

CONCLUSIONS

The research reviewed for this analysis demonstrated that the use
of simulators consistently produced superior training for jet pilots
(relative to aircraft-only training). Since the analysis included such
a small number of helicopter experiments, no conclusion on the training
effectiveness of the helicopter simulators could be drawn. Motion cuing was
found not to add significantly to the training for jet pilots and, in some
cases, may have detracted from the training. The conclusions concerning the
training outcomes for motion-based simulators were considered highly tentative
because of the methods used when the motion-related experiments were
conducted. There were too few experiments comparing training in motion-based
simulators to training with no motion for helicopter pilots for an analysis to
be done. In general, training outcomes appear to be influenced considerably by
the type of task trained and the amount and type of training given.

Several specific training variables were examined. The findings from
these areas are as follows:

Task Equipment

The outcomes of the experiments involving the training of jet pilots were
different from those involving the training of helicopter pilots. Results
differed in both size and pattern of training outcomes. Jet experiments
consistently found simulator training combined with aircraft training to be
better than training in the aircraft alone. The findings from similar
helicopter experiments were less consistent, and only slightly favored
simulator training combined with aircraft training over aircraft training
alone.

An insufficient number of helicopter experiments (total N = 7) precluded
any in-depth analysis involving this type of aircraft. Therefore, the results
of the meta-analysis are specific to jet aircraft training involving recent
Undergraduate Pilot Training (UPT) graduates or current trainees with little
or no experience in a simulator or in a jet aircraft.

For jets, the overall training effect for all tasks trained was positive
and robust. Over 90 percent of the experimental comparisons favored the
simulator and aircraft trained group over the aircraft-only trained group. On
average, subjective performance measures (e.g., instructor ratings) were
more sensitive to training effects, and produced larger effects than those
obtained with objective measures (e.g., instrument readings). As training for
both groups progressed and reached the point where it was conducted solely in
the aircraft, differences between the groups diminished.

Task Requirements

Certain tasks were more effectively trained in the simulator than others.
For jets, when simulators were used for the training of takeoff, approach (to
landing), and landing (excluding carrier landings) tasks, the training effects
were greater than they were for the combination of all tasks.

Trainee  Characteristics 

Only two trainee characteristics were identified as likely to have an
effect on training results: flight experience and UPT grades. These
differences in trainees were rarely studied. When there was concern that
these differences might affect training in any single experiment, an effort
was made to compose each of the trainee groups with equal amounts of
experience or similar grades.

Simulator  Design 

For jet training, motion cuing was found to add nothing to the simulator
training effectiveness, and in some cases, may have taken away from the
training value of the simulator. However, this finding may not be truly
representative of the effectiveness of motion-based training since: 1) there
was a lack of periodic calibration of the motion cuing systems; and 2) the
results were based on all tasks combined. The positive effects of motion for
any one task may have been masked by the negative effects of motion for
another task.

Training Context

The  average  effectiveness  for  training  programs  where  trainees  were 
allowed  to  progress  based  on  a  demonstrated  proficiency  was  greater  than  for 
training  programs  where  all  trainees  proceeded  at  the  same  pace.  Information 
on other aspects of the training context, such as the use of instructional
features and the provision of feedback, was seldom reported and could not,
therefore, be analyzed.


INTRODUCTION

Since the introduction of flight simulators to aviation training,
questions about the most effective designs of the resulting systems have
proliferated. There has long been the expectation that answers to these
questions are to be found in the extensive research that has evaluated the use
of flight simulators in training. However, individual research results have
been mixed, and narrative reviews of the research domain have failed to
provide specific training system requirements for producing the most
favorable training outcomes.

Several  reasons  may  be  cited  for  the  disparity  of  results  in  this  domain. 
For  example,  Caro,  in  1973,  noted  that  the  primary  focus  of  research  efforts 
had  been  to  determine  the  effectiveness  of  a  particular  simulator  within  a 
given  training  program,  rather  than  to  manipulate  various  training  system 
elements, in order to develop general design principles. Nearly ten years
later, Waag (1981), in reviewing more recent research, made the same
observation.

The narrow research focus for experimentation within this area was
adopted, in part, from practical requirements. The cost of a full mission
simulator, and the operational aircraft required for assessing transfer of
training (TOT), dictate that most research in aviation training must occur in
an on-going training program. Conducting research in this setting
necessitates that the research be secondary to safety factors and to the
training program. This almost inevitably interferes with experimental control
(Osborne, Broyles, & Quick, 1983) and dictates that the research will be
designed to answer the one question that is most pressing for the training
organization: Does this simulator train?

Another factor that helps shape research objectives arises from the
underlying assumption regarding the role of the simulator within the training
program (Eddowes & Waag, 1980). Due to efforts aimed at saving on the cost of
operating the aircraft, the simulator, in many cases, has been considered a
substitute aircraft rather than a training device. This view places emphasis
on the similarity of the simulator to the operational equipment, and tends to
de-emphasize the investigation of other elements that may have an impact on
training. In support of this viewpoint, there is theory and research
suggesting that increasing the common elements that exist between the training
and operational environments increases transfer of training (Osgood, 1949;
Thorndike, 1903).

According to Eddowes and Waag (1980), the simulator can alternatively be
viewed as a teaching tool, and its effectiveness can be improved in ways other
than by increasing its physical similarity to the operational equipment. This
assumption is supported by mounting experimental results indicating that
positive training outcomes may be realized using simulators that do not have a
high physical resemblance to the operational aircraft (see, e.g., Caro, Corley,
Spears, & Blaiwes, 1984). Prophet and Boyd (1970) demonstrated that
procedural training could be just as effective when trainees practiced on a
simple mockup of a cockpit as when they trained in a sophisticated
computerized trainer. Finally, there is convincing evidence indicating that
the effectiveness of a flight simulator varies according to the training
method used (see, e.g., Bailey, Hughes, & Jones, 1980).

The term commonly used to describe the degree of realism between the
simulated and operational environments is fidelity. Fidelity has been used in
a variety of contexts and has been given a number of definitions (Hays, 1980).
For this review, the term fidelity is based on the definition presented by
Hays and Singer (1989).

Simulation fidelity is the degree of similarity between the training
situation and the operational situation which is simulated. It is a
two dimensional measurement of this similarity in terms of: (1) the
physical characteristics, for example, visual, spatial, kinesthetic,
etc.; and (2) the functional characteristics, for example, the
informational, and stimulus and response options of the training
situation (Hays & Singer, 1989, p. 50).

As  this  definition  makes  clear,  the  realism  of  simulation  is  a  complex 
concept.  The  majority  of  research  has  examined  only  the  physical  aspects  of 
this  concept,  rather  than  the  functional  aspects. 

With  so  much  variance  in  experimental  objectives  and  in  training 
orientations,  there  are  no  individual  experiments  with  aviation  training 
devices  that  can  answer  questions  of  general  interest  for  training. 
Integration  is  necessary  to  determine  if  this  body  of  research  can  produce 
guidance  for  training  system  design  and  for  future  research. 

RESEARCH  INTEGRATION 

The traditional form of review, the narrative review, has been unequal to
the task of integrating the results from diverse experiments. Reviewers of
aviation training research are required to make a large number of judgments
in combining the information that provides a summary of the research area.
These judgments include comparisons of issues associated with experimental
control, training tasks, level of trainees, length of training programs, and
the relative value of one reported statistic over another. The number of
decisions to be made is large, and the reviewer has no guidance in making
these decisions. Furthermore, since the reviewer is not required to document
in the review how the decisions were made, it is difficult for the reader to
assess the relative value of two different reviews. To correct for apparent
shortcomings inherent in the narrative review method (see Jackson, 1960),
several quantitative review techniques, collectively known as meta-analysis,
have been developed (Glass, McGaw, & Smith, 1981; Hunter, Schmidt, & Jackson,
1982; Rosenthal, 1978).
META-ANALYSIS

Meta-analysis seeks to aggregate and transform individual research
outcomes into a common effect size metric (e.g., d or r), then to compute a
mean value across experiments to obtain a good estimate of the population
value (Glass et al., 1981). While the techniques involved may vary,
meta-analytic reviews are becoming an increasingly popular tool for social
scientists (Bangert-Drowns, 1986). One important advantage of meta-analysis
over the narrative review method is the explicit information provided on the
decision processes used by the reviewer (Mullen & Rosenthal, 1983).

Mullen and Rosenthal (1985) note that the combination of some research
characteristics relies heavily on the subjective decision-making processes of
the reviewer. They presented several decisions that may critically affect the
outcome of a meta-analytic review, including: 1) the choice, coding, and use
of research characteristics; 2) decisions inherent in the data/retrieval
reconstruction process; and 3) methods used to summarize results across
experiments. The approach they recommend for appropriately dealing with
subjective decision processes is to "...make explicit the rationale behind,
and the procedures underlying, those coding schemes used" (p. 18), and to
impart reliability to the decision-making process by using several coders.

At least two different approaches to meta-analysis have emerged. These
approaches reflect different philosophies concerning variation in effect sizes
(Mathieu & Tannenbaum, 1983; see also Dickinson, Hassett, & Tannenbaum, 1986).
The first, advocated by Glass et al. (1981), assumes that the variability in
effect sizes within a given domain is due to moderator variables. For
example, training effectiveness could be modified by characteristics of the
simulator, the trainees, the instructor, or other moderator variables.
According to this approach, effect sizes are regressed upon the moderator
variables of interest and the resulting outcome is used to explain differences
between the research effect sizes.
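
The regression step of this approach can be illustrated with a minimal
sketch. It assumes a single moderator coded 0 or 1 (a hypothetical
motion-cuing indicator) and made-up effect sizes. The function name fit_line
and all data values are illustrative assumptions, not values or procedures
taken from the reviewed experiments.

```python
def fit_line(moderator, effect_sizes):
    """Ordinary least-squares fit of effect size on one moderator.

    Returns (intercept, slope). With a 0/1 moderator, the slope equals the
    difference between the mean effect sizes of the two groups.
    """
    if len(moderator) != len(effect_sizes) or len(moderator) < 2:
        raise ValueError("need two equal-length lists with at least two entries")
    n = len(moderator)
    mean_x = sum(moderator) / n
    mean_y = sum(effect_sizes) / n
    sxx = sum((x - mean_x) ** 2 for x in moderator)
    if sxx == 0:
        raise ValueError("moderator has no variability")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(moderator, effect_sizes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: 0 = fixed-base simulator, 1 = motion-based simulator.
motion = [0, 0, 1, 1]
effects = [0.2, 0.4, 0.5, 0.7]
intercept, slope = fit_line(motion, effects)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

A nonzero slope in such a fit would be read, under the Glassian assumption,
as evidence that the moderator explains part of the variation in effect sizes.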

Another approach to meta-analysis is based on the work by Hunter et al.
(1982). This approach differs from the Glassian (1981) approach, in that it
is more conservative with regard to moderator variables. Specifically, Hunter
et al. (1982) caution that variation of effect sizes may partially result from
such artifacts as: 1) sampling error; 2) measurement unreliability; and 3)
range restriction. Their approach advocates correcting for these artifacts
prior to the search for valid moderator variables. It follows that if
sufficient unexplained variability in the effect sizes remains after
removing error variance from the above three sources, then a search for
moderator variables is warranted. In general, the Hunter et al. (1982)
approach is more conservative than the Glassian (1981) approach because it
minimizes the likelihood of incorrectly inferring that a valid moderator exists.
This review incorporates meta-analytic procedures advocated by Hunter et
al. (1982), although formulas from Glass et al. (1981) were used to
derive effect size values within a given experiment.
In sum, several factors have made it difficult to accurately determine
what variables affect flight simulator training outcomes, and to what degree
they do so. One important factor is the overly narrow focus of individual
experiments that make up the research in this area. Traditional narrative
reviews of this domain have failed to extract the information that would allow
specification of training principles. An alternative to the narrative review
is meta-analysis, which employs quantitative review techniques.

OBJECTIVES 

In light of the above, the objectives for this research were to conduct a
meta-analysis in order to: 1) identify variables that affect flight simulator
training outcomes; 2) identify information gaps in the literature (i.e.,
variables of interest that have yet to be systematically evaluated); and 3)
provide direction for future research. Satisfaction of these objectives will
contribute to the larger objective of improved flight training for military
pilots.
METHOD


In order to analyze research reports in the aviation training area, with
regard to variables that may affect training outcomes, a conceptual meta-model
was developed. The meta-model is presented in Figure 1, and includes the two
broad areas thought to have an impact on the magnitude of training outcomes:
1) input variables, such as the requirements of the task, task equipment, and
trainee characteristics; and 2) throughput variables, including simulator
design and training context. These areas were selected from training
variables that were most consistently identified as important in various
reviews of the simulator training research cited below. With the exception of
trainee characteristics, the areas of interest in the meta-model are
consistent with those specified by Wheaton et al. (1976). Trainee
characteristics were added to the present review because there is a growing
interest in the area of individual differences and how such differences may
affect training (see, e.g., Hogan, Arneson, & Salas, 1987; Jones, Kennedy,
Turnage, Kuntz, & Jones, 1986).


[Figure 1: diagram not reproduced. Diagram labels: Task Equipment, Task
Requirements, Trainee Characteristics, Simulator Design, Training Context,
Training Outcomes, Trainee Performance, Training Type, Performance
Measurement.]

Figure 1. The Meta-Model for Simulator Fidelity

LITERATURE  SEARCH 


Experiments within the domain of simulator fidelity and training
effectiveness were first identified through An Annotated Bibliography of
Abstracts on the Use of Simulators for Technical Training (Ayres, Hays,
Singer, & Heinicke, 1984). This document was compiled after literature
searches for the years 1937 to 1982. Additional searches were conducted for
articles for the years 1982 to 1986. Other experiments were located through
the reference lists in the obtained articles and in a published search from
the U.S. Department of Commerce on Flight Simulator Training (December,
19[illegible], to November, 1985). An open letter was given to all attendees
of the Fourth Annual Flight Simulation Update Conference, 1988, requesting
recent articles, published or unpublished, that addressed transfer of training
in aviation. In addition, individual researchers were contacted and asked for
information on relevant research that may not have been included in the
published domain
(e.g., articles in press or in preparation). Finally, the Technical
Information Center at the Naval Training Systems Center, Orlando, was searched
for published technical reports within the aviation training domain.

Technical Report 89-006

Table 1 presents a listing of the primary sources used to obtain
information for this meta-analysis. Key words that were used when locating
experiments were: simulation training, training devices, simulator fidelity,
training device requirements, transfer of training, training effectiveness
evaluation, simulator cost effectiveness, fidelity guidance, computer
simulation, simulated environment, flight training, military training, and job
training.

A total of 247 journal articles, book chapters, and technical reports on
training effectiveness were collected. The literature was divided into four
categories: reference materials, aviation device empirical research,
empirical research on other devices, and non-relevant information. The
reference materials were reviewed and added to the data base if they
contributed to the understanding of the empirical research. Appendix A lists
experiments excluded from the meta-analysis and reasons why each was rejected.
Only the experiments that involved training with a simulator and transfer to
operational equipment were retained. Of those experiments, only the ones that
reported the necessary statistics for meta-analysis could be included in the
research integration. If an experiment lacked sufficient statistics, efforts
were made to contact those who had conducted the experiment to see if they
could supply the necessary data.

CODE  SHEET 

A code sheet was developed for use in extracting data from the collected
research. This code sheet was based on the meta-model, and its purpose was to
ensure that the critical information for this analysis would be collected from
each report.

The initial version of the code sheet, presented in Appendix B, lists: 1)
classification of task equipment; 2) training context variables; 3) the
training task; 4) trainee characteristics; and 5) areas related to research
design characteristics and sample population. Simulator design and fidelity
level (although part of the meta-model) were not included in the initial code
sheet. A sampling of the literature indicated that information describing the
various systems which combine to make up the simulator, such as those related
to motion and visual display, varied considerably from report to report. In
many instances, one or more secondary sources were cited in lieu of a detailed
description of the simulator. Coding of fidelity issues was delayed until
more information on the simulators could be gathered.

The initial code sheet included the topic areas as major headings. An
area above the headings was used for recording useful statistics and report
identification information. The first 25 experiments were reviewed by three
individuals who coded the same experiments and discussed important
characteristics found for each topic area. Through these discussions,
potentially useful characteristics were identified and a more formalized code
sheet was developed.

Table 1

Primary Sources Used to Locate Relevant Experiments

Source                                             Year(s)

1. Computer Search                                 1982-1986
     ERIC
     NTIS
     Psychological Abstracts

2. Literature Reviews and Bibliographies
     AGARD-AR-159                                  1981
     Ayres, Hays, Singer, & Heinicke               1984
     Caro                                          1977
     Hays & Singer                                 1989
     Kinkade & Wheaton                             1972
     Martin                                        1981
     Waag                                          1981
     Wheaton, Rose, Fingerman, Korotkin,
       Holding, & Mirabella                        1976

Appendix C lists items that were listed in an early version of the code
sheet, but were eliminated due to lack of information in the research reports.
These items include information related to research design, level of simulator
fidelity, and training characteristics (e.g., proficiency-based vs. blocked
design, number of training trials). The final code sheet, shown in Appendix
D, was developed through a series of iterations based on coders' discussions.
CODER TRAINING

Groups of two or more coders met periodically during the coding process
to define coding response categories more precisely. The definitions for the
coding sheet, along with any caveats, were recorded in a code book that was
referred to during the coding process.

The purpose of the code book was to improve interrater reliability by
providing standard information to the coders. The code book offered
guidelines for selecting among coding categories, response categories,
location of information, and correct calculations for numerical responses.
The code book is presented in Appendix E with a description of information
included for each area.

As  noted  in  the  previous  section,  coding  the  simulator  configuration  and 
fidelity  level  topic  area  was  particularly  problematic.  Hays  and  Singer 
(1989)  provided  a  conceptual  framework  that  guided  initial  efforts  in  coding 
aspects  of  simulator  and  simulation  fidelity.  Pragmatic  issues  were  explored 
through discussions with engineers whose expertise included simulator design
and development. It was necessary to contact knowledgeable persons (e.g.,
primary  investigators  or  Naval  Training  Systems  Center  project  managers 
familiar  with  each  device)  to  fill  in  information  gaps  pertaining  to  simulator 
configuration  and  fidelity  level.  When  possible,  several  persons  were 
contacted  as  a  means  of  corroborating  this  information.  The  final  response 
categories  for  this  area  took  into  consideration  both  conceptual  and  pragmatic 
concerns,  and  fused  them  with  the  additional  constraint  of  availability  of 
requisite  information. 

It should be noted that the fidelity level of a simulator was determined
only for the individual subsystems that make up the simulator (e.g., visual,
motion, sound). No attempt was made to give an overall fidelity rating, since
it is not possible to assess the relative contribution of any
subsystem to the simulator as a whole.

CODING  PROCEDURE 

At least two coders independently coded all research. The completed code
sheets were discussed, and all discrepancies were resolved by consensus
decision. A consensus decision process was used instead of a pooling
procedure because: 1) many coding responses were discrete; 2) the consensus
decision process served as a continual form of training for the coders; and 3)
coders would cite information directly from the report to substantiate their
coding response, thereby increasing the thoroughness of the coding task. The
independently coded responses were used to calculate interrater reliability
estimates, and the consensus-derived coded responses were used when performing
all other analyses.
INTERRATER RELIABILITY

Among the several interrater reliability indices suggested in the
literature, two are applicable to the coding procedure used in this research.
In general, indices that tapped interrater agreement, as opposed to interrater
consistency, were used, since the latter type of index allows for the
possibility of having different coded responses with a demonstrated perfect
interrater reliability (Tinsley & Weiss, 1975; Jones, Johnson, Butler, & Main,
1983; see also Dickinson et al., 1986). For discrete response items, Cohen's
kappa (Cohen, 1968) was calculated because there were at least three response
classification categories for all but two items, which were dichotomous in
nature. The Cohen's kappa formula produces values ranging from -1.0 to 1.0,
with zero (0) indicating chance agreement and 1.0 indicating perfect
agreement. For continuous response items, an intraclass correlation
coefficient (ICC) was calculated that provides an indication of the degree to
which the two coders' responses are interchangeable (Shrout & Fleiss, 1979).
This coefficient was used here because two coders were responsible for coding
the experiments after the initial phases of formalizing the coding sheet were
completed.
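
As an illustration of the agreement index for discrete items, the following
is a minimal sketch of unweighted Cohen's kappa for two coders. The function
name cohens_kappa and the example codes are hypothetical; the report does not
state whether a weighted or unweighted form was used, so the unweighted form
is an assumption here.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two coders' discrete codes.

    kappa = (p_observed - p_chance) / (1 - p_chance)
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same code.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if p_chance == 1.0:
        # Both coders used one identical category throughout: no variability.
        raise ValueError("kappa is undefined when responses have no variability")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes for one code-sheet item across four experiments.
a = ["visual", "visual", "motion", "motion"]
b = ["visual", "visual", "motion", "visual"]
print(cohens_kappa(a, b))
```

The explicit error for the no-variability case reflects why such items cannot
contribute a kappa value and have to be handled separately.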

The reliability estimates indicated that moderate to high levels of
interrater reliability were obtained using the coding procedure. For all
discrete response items, the mean Cohen's kappa value was .67 and ranged from
.34 to .92. When items having no variability were deleted, the mean kappa
value was .69 (range was from .63 to .94). For continuous response items, the
ICC (2,1) value was .95.
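
For readers unfamiliar with the ICC (2,1) index, the sketch below computes it
from a two-way layout (one row per coded item, one column per coder) using
the mean-square form ICC(2,1) = (BMS - EMS) / (BMS + (k - 1)EMS +
k(JMS - EMS)/n). The function name icc_2_1 and the example ratings are
hypothetical and are not data from this review.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    ratings: list of rows, one row per coded item, one column per coder.
    """
    n = len(ratings)                 # number of coded items (targets)
    k = len(ratings[0]) if n else 0  # number of coders (judges)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)                # between-items mean square
    jms = ss_cols / (k - 1)                # between-coders mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical continuous codes (e.g., training hours) from two coders.
hours = [[10.0, 10.5], [4.0, 4.0], [7.5, 8.0], [12.0, 11.5]]
print(icc_2_1(hours))
```

Because the between-coders term appears in the denominator, a systematic
offset between coders lowers this index even when their rankings agree.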

CALCULATION OF RESEARCH EFFECTS

There are several training outcome effect size (ES) estimates that could
be used for summarizing the experiments used in this report. Glass et al.
(1981) advocate use of what is commonly referred to as the d
(difference) statistic, calculated by taking the difference between the mean
performance scores of the experimental and control groups, then dividing this
difference by the control group standard deviation (use of a pooled standard
deviation has also been suggested). However, Hunter et al. (1982) note that
this ES estimate is strongly dependent on sampling error. These researchers
advocate use of either a biserial or point biserial correlation coefficient
for several important reasons: first, biserial and point biserial statistics
can be corrected for statistical biases from sampling error, measurement
error, and restriction of range (for both the measurement and criterion
variables); second, they can be transformed into the d statistic, and are
thereby readily interpretable; and third, both types of correlation
coefficients can be used with multivariate analysis techniques, which have
been found useful for analyzing research characteristics to identify
potential moderator variables (e.g., Dickinson et al., 1986; Hunter et al.,
1982).
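
The d statistic and its relation to a correlation-type effect size can be
sketched as follows. The conversion formulas shown assume two groups of equal
size, which is a simplifying assumption of this example and not necessarily
the procedure used in Appendix F. The function names and scores are
hypothetical.

```python
import math
from statistics import mean, stdev


def glass_d(experimental, control):
    """Mean difference divided by the control group standard deviation."""
    return (mean(experimental) - mean(control)) / stdev(control)


def r_to_d(r):
    """Convert a point biserial r to d, assuming equal group sizes."""
    return 2 * r / math.sqrt(1 - r * r)


def d_to_r(d):
    """Convert d to a point biserial r, assuming equal group sizes."""
    return d / math.sqrt(d * d + 4)


# Hypothetical performance scores for simulator-plus-aircraft (experimental)
# and aircraft-only (control) trainees.
experimental = [5, 6, 7]
control = [3, 4, 5]
d = glass_d(experimental, control)
print(d, d_to_r(d))
```

With unequal group sizes the constant 4 in d_to_r would need to be replaced
by a term that depends on the group proportions.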

The point biserial correlation coefficient was chosen for use in this
review because the separation of subjects into either an experimental or
control group established a "true" dichotomy, a primary consideration for
determining appropriate use of the point biserial correlation coefficient
(Isaac & Michael, 1978, p. 126). The other criterion, that the performance
measure be continuous in nature, also applies to experiments used in this
review.

A detailed description of procedures used to convert one or more research
statistics into a weighted mean point biserial correlation coefficient,
denoted as RPB, is given in Appendix F. In general, the procedures chosen
were those that would produce conservative RPBs for individual experiments,
and thus the overall (population) RPB may be viewed as a conservative
estimate of the flight simulator training effectiveness.

According to the Hunter et al. (1982) approach, variability involving
criterion performance measures should be corrected for sampling error,
unreliability, and range restriction whenever possible. In this analysis,
only the correction for sampling error was used because: 1) entire classes of
Undergraduate Pilot Training (UPT) graduates (or current Undergraduate Pilot
trainees) were used as subjects in many cases; and 2) sampling error usually
accounts for a majority of the spurious error relative to the other two
sources (Schmitt, Gooding, Noe, & Kirsch, 1984).

With regard to transfer effectiveness, two major problems precluded
attaching a dollar figure or time/training savings figure to a given RPB
value. First, cumulative RPB values reported here collapse across different
training programs, simulators, and tasks. Training effectiveness measures are
highly dependent on training, equipment, and task variables (Orlansky &
String, 1977). A second problem is related to the rapidity of technical
advances in this domain. Many of the experiments included in this report were
completed over ten years ago. Technology has advanced to such a degree since
then that cost savings or other training effectiveness metrics related to
these results may not be applicable within current simulator training
programs.

Outcome measurements that were directly or indirectly based on some form
of evaluator rating were considered subjective in nature. Instructor pilot
(IP) ratings were the most common assessment technique for experiments
reported in this review. Even seemingly objective measures, such as
trials-to-proficiency, when proficiency was based on IP judgment, were
classified as subjective. Only measures that were based on clearly objective
indices, such as recording of instrument readings at selected points during a
flight-control maneuver (Martin & Waag, 1978b), were considered objective.

Initial and final transfer trial measures were coded for all experiments
that specifically reported this information. Final transfer trial information
was not standardized. The actual trial number used to calculate the final
transfer trial RPBs ranged between the third and seventh transfer trial across
experiments.

The other measure used to evaluate training effectiveness in this review
was the percentage of negative research statistics. This measure is a ratio
of the number of research statistics that were negative in value (i.e.,
instances where the control group performance was superior to that of the
experimental group), divided by the total number of research statistics used
to calculate a specific RPB. Thus, for a given experiment, a percent
negative research statistic measure was calculated for each valid RPB area
(e.g., overall, objective only, subjective only) produced for the experiment.
Although the percent negative research statistic metric is an indirect measure
of training effectiveness, it does provide information about the consistency
with which experimental training outcomes favored the experimental or control
group.
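The percent negative measure described above can be sketched as follows. The function name and the example values are hypothetical; the sketch assumes that each research statistic has already been signed so that a negative value means the control group outperformed the experimental group, and that the ratio is expressed as a percentage.

```python
def percent_negative(statistics):
    """Percentage of research statistics that are negative in value."""
    if not statistics:
        raise ValueError("at least one research statistic is required")
    # A negative statistic means the control group outperformed
    # the experimental group on that measure.
    negatives = sum(1 for s in statistics if s < 0)
    return 100.0 * negatives / len(statistics)


# Illustrative t values for one RPB area of one experiment (not real data).
example = percent_negative([2.1, 0.8, -0.4, 1.5])
```

In this illustration one of four statistics is negative, giving 25 percent.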

PROCEDURE  FOR  DETERMINING  MODERATOR  VARIABLES 

Experiments were coded according to both continuous and discrete response
items. Since the number of experiments for each type of aircraft was small,
and particularly so for helicopters, inclusion of response categories for
subsequent analysis could not be guided by multivariate statistical procedures
(e.g., multiple regression and factor analysis) found useful in other
meta-analytic reviews (Dickinson et al., 1986; Hunter et al., 1982). Instead,
individual response categories were examined using descriptive measures to
assess whether there existed sufficient variability for follow-up analysis.
Next, correlation coefficients were calculated between selected response
categories and each dependent measure, and among the remaining
independent variables. Finally, potential moderator variables that were
identified by correlational analysis were examined further using subgroup
analysis outlined by Hunter et al. (1982, p. 105; see also Dickinson et al.,
1986).

The procedure for determining if a variable is a moderator using subgroup
analysis was as follows. The weighted mean effect size (RPB), observed
variance, error variance, and "true" variance for the total group and for
individual subgroupings of experiments were compared. Valid moderator
variables produce different RPB estimates for separate subgroups when compared
to each other. More importantly, the "true" variance for the individual
subgroups is reduced relative to the total group. This reduction indicates
that partitioning the total data set into two or more subgroups is
appropriate, since these subgroups are more homogeneous in nature (i.e., show
less variability) relative to the total group.

A rule-of-thumb for determining whether subgroup analysis is appropriate
is given by Pearlman, Schmidt, and Hunter (1980). This rule states that 25
percent or more unexplained variance must remain after correcting for research
artifacts for the total group, before it is appropriate to look for moderator
variables.
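The quantities compared in subgroup analysis can be sketched with the "bare-bones" sampling-error correction commonly attributed to Hunter et al. (1982). This is an illustration, not the authors' program: the function name is hypothetical, the error-variance formula (1 - r^2)^2 * K / N is the usual textbook form, and the use of each experiment's mean N as its weight is an assumption. The input values are the overall RPB and mean N entries for the five jet motion versus no-motion comparisons as read from Table 2.

```python
def bare_bones(rs, ns):
    """Weighted mean r with observed, error, and 'true' variance."""
    k = len(rs)
    total_n = sum(ns)
    # Sample-size-weighted mean correlation.
    mean_r = sum(n * r for r, n in zip(rs, ns)) / total_n
    # Sample-size-weighted observed variance of the correlations.
    var_obs = sum(n * (r - mean_r) ** 2 for r, n in zip(rs, ns)) / total_n
    # Variance expected from sampling error alone (assumed textbook form).
    var_err = (1.0 - mean_r ** 2) ** 2 * k / total_n
    # "True" variance is what remains, floored at zero.
    var_true = max(var_obs - var_err, 0.0)
    unexplained = var_true / var_obs if var_obs > 0 else 0.0
    return {"mean_r": mean_r, "var_obs": var_obs, "var_err": var_err,
            "var_true": var_true, "unexplained": unexplained}


# Overall RPB and mean N for the jet motion vs. no-motion comparisons.
motion_r = [0.001, 0.081, 0.101, -0.297, 0.138]
motion_n = [16, 14.7, 24, 50, 30]
result = bare_bones(motion_r, motion_n)
# Rule of thumb: look for moderators only if at least 25 percent
# of the observed variance remains unexplained.
search_for_moderators = result["unexplained"] >= 0.25
```

Under these assumptions the weighted mean rounds to -.05, the value shown for the jet motion versus no-motion subgroup in Table 3. The variance figures depend on the assumed weighting and need not reproduce the appendix values.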
RESULTS


OVERVIEW 

A total of 26 transfer of training experiments were coded: 19 for jet
aircraft and seven for helicopters. Table 2 presents a brief summary of
important information from each of the experiments. The last two columns
describe the research statistics and the weighted mean point biserial
correlation coefficients (RPBs), respectively. RPBs were calculated for five
areas: 1) the overall training outcome effect (averaged across task type,
transfer trial, and type of outcome measure); 2) objective measures only; 3)
subjective measures only; 4) initial transfer trial; and 5) final transfer
trial.


Table 3 presents a breakdown of the total number of experiments included
in the meta-analysis based on aircraft type and experiment type. These
breakdown variables were both conceptually and empirically based.
Conceptually, since aircraft (task equipment) differ immensely in appearance
and aerodynamics (jets are different from propeller-driven aircraft, and both
of these aircraft are very different from vertical takeoff-and-landing
aircraft), the pattern of training outcomes could be expected to differ as
well. Empirically, previous reviews of flight simulation training literature
have noted that findings from one type of aircraft do not necessarily
generalize to other aircraft (Martin, 1981; Orlansky & String, 1977).
Findings reported here support this view.

In addition to aircraft type, it was expected that different types of
experiments would produce dissimilar training outcomes. Previous analyses of
experiments support the contention that experiments that compare simulator
training with no simulator training show different results, as a group, from
those experiments that compare motion-based simulators with no-motion
simulators (Orlansky & String, 1977; Martin, 1981). Subgroup analyses done
for this research indicate that collapsing across these two types of
experiments is not meaningful.

PRELIMINARY  ANALYSIS 

Task Equipment

Prior to any other analysis, an analysis was performed examining whether
experiments using either jets or helicopters should be treated separately or
should be combined. This analysis was based on the meta-model, which suggested
that the actual task equipment (an input variable) may affect the training
outcome. Appendix G presents the results of the subgroup analysis based on
aircraft type, indicating a substantial difference between jet and helicopter
experiments. The percent unexplained variance for the combined (total) group
was .37, thereby exceeding the minimum .25 suggested by Pearlman et al.
(1980). The subgroup analysis revealed substantial differences between
the RPBs of jet and helicopter experiments. Even when results were collapsed
Table 2

Summary of Important Information for Experiments Included in the Meta-analysis

1.  Ryan et al. (1972)/NTEC
    Population:           UPT grads. [Navy]
    Total N (# Grps):     124 (4)
    Simulator/Aircraft:   2F90 (CFT)/TA-4J
    Tasks:                Basic instrument maneuvers, B stage of advanced
                          jet phase training
    Training length:      15 total hours: 8 hrs emergency procedures, 7 hrs
                          basic instrument/navigation
    Research statistics:  Reported [illegible] value; t-test using IP check
                          flight raw scores (Table 6, p. 16)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .206 (63); RPB(3) = .206 (63)

2.  Browning et al. (1973)/TAEG
    Population:           UPT grads. [Navy]
    Total N (# Grps):     26 (2)
    Simulator/Aircraft:   2F69D (OFT) with 2C23A (CFT)/P-3
    Tasks:                Tasks related to 109 item procedures/systems
                          checklist
    Training length:      9 total trials lasting 8 wks.; 6 in 2F69D & 3 in
                          2C23A
    Research statistics:  Reported p value (<.05) and group means (p. 26)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .354 (26); RPB(3) = .354 (26)

3.  Brictson & Burger (1976)/NTEC
    Population:           CAT I Fleet replacement pilots/varied experience
                          [Navy]
    Total N (# Grps):     53 (2)
    Simulator/Aircraft:   (NCLT)/A-7E
    Tasks:                Night carrier landings
    Training length:      Avg. of 80 trials ("ball" control passes)
    Research statistics:  Reported t-tests using IP ratings & obj. perf.
                          meas. (Table A-1, p. 68)
    Comparison:           Full vs. limited Sim. Trng.
    RPB (Mean N):         RPB(1) = .072 (144.67); RPB(2) = .065 (219.8);
                          RPB(3) = .113 (50.75)

4.  Payne et al. (1976)/Northrop Corp.
    Population:           UPT grads. & additional pilots with varied
                          experience
    Total N (# Grps):     16 (2)
    Simulator/Aircraft:   (LAS-WAVS)/F-4J
    Tasks:                8 air combat maneuvers
    Training length:      6 trials, 1 hr per trial
    Research statistics:  Converted U values to [illegible] values (Figs.
                          11-13, pp. 41, 42, 44, & text pp. 36-38)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .402 (16); RPB(3) = .402 (16)

5.  Woodruff et al. (1976)/AFHRL
    Population:           UPT grads. [Air Force]
    Total N (# Grps):     16 (2)
    Simulator/Aircraft:   ASPT/T-37
    Tasks:                Total of 4 tasks from basic to navigation
    Training length:      Varied; mean # hours = 25.5
    Research statistics:  t-tests calculated using raw hrs to proficiency
                          (Table 2, p. 10) & IP ratings (p. 12)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .547 (16); RPB(3) = .547 (16)

6.  Browning et al. (1977)/TAEG
    Population:           UPT grads. [Navy]
    Total N (# Grps):     34 (2)
    Simulator/Aircraft:   2F87F (OFT) compared to 2F69D (OFT)/P-3
    Tasks:                20 aircraft control maneuvers
    Training length:      6 trials, 2 hrs per trial
    Research statistics:  t-tests calculated using reported mean flights to
                          proficiency (Table 4, p. 24)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .606 (34); RPB(3) = .606 (34)

7.  Gray & Fuller (1977)/AFHRL
    Population:           UPT grads. [Air Force]
    Total N (# Grps):     24 (3)
    Simulator/Aircraft:   ASPT/F-5B
    Tasks:                10, 15, & 30 degree bomb delivery runs
    Training length:      8 trials, 1 hr per trial
    Research statistics:  Reported t-tests (p. 13) and Chi square values
    Comparison:           Motion vs. No-motion
    RPB (Mean N):         RPB(1) = .001 (16); RPB(2) = .017 (16);
                          RPB(3) = .046 (16)

8.  Browning et al. (1978)/TAEG
    Population:           UPT grads. with advanced flight training [Navy]
    Total N (# Grps):     37 (2)
    Simulator/Aircraft:   2F87F (OFT)/[aircraft not legible]
    Tasks:                22 tasks of varying difficulty
    Training length:      6 trials
    Research statistics:  t-tests calculated from reported means (Table 4,
                          p. 17)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .598 (37); RPB(3) = .598 (37)

Note: RPB(1) = Overall; RPB(2) = Obj. meas. only; RPB(3) = Subj. meas. only;
RPB(4) = Initial transfer; RPB(5) = Final transfer.
Table 2 (continued)

9. & 10.  Martin & Waag (1978a)/AFHRL
    Population:           UP trainees with least flight experience
                          [Air Force]
    Total N (# Grps):     24 (3)
    Simulator/Aircraft:   ASPT/T-37
    Tasks:                7 basic flight control maneuvers
    Training length:      10 trials
    Research statistics:  Reported t-tests (Table E3, p. 38) & calculated
                          t-tests from IP ratings (App. 01, pp. 30-32)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1)&(3) = .552 (24)
    Comparison:           Motion vs. No-motion
    RPB (Mean N):         RPB(1)&(3) = .081 (14.7); RPB(4) = .094 (14);
                          RPB(5) = .069 (15.4)

11. & 12.  Martin & Waag (1978b)/AFHRL
    Population:           UPT grads. [Air Force]
    Total N (# Grps):     36 (3)
    Simulator/Aircraft:   ASPT/T-37
    Tasks:                8 aerobatic flight maneuvers
    Training length:      5 trials, 5.5 total hrs.
    Research statistics:  Reported t-tests (Table 03, pp. 29-30)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .130 (36); RPB(2) = .023 (36);
                          RPB(3) = .248 (36)
    Comparison:           Motion vs. No-motion
    RPB (Mean N):         RPB(1) = .101 (24); RPB(2) = .201 (24);
                          RPB(3) = -.01 (24)

13. & 14.  Ryan et al. (1978)/TAEG
    Population:           UPT grads. [Navy]
    Total N (# Grps):     95 (4)
    Simulator/Aircraft:   2F87F (OFT)/P-3
    Tasks:                3 landing tasks
    Training length:      6 trials
    Research statistics:  p value (<.05) (p. 12, group C-3 vs. E)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1)&(2) = .383 (29)
    Comparison:           Motion vs. No-motion
    RPB (Mean N):         RPB(1)&(2) = -.297 (50)

15.  Nataupsky et al. (1979)/AFHRL
    Population:           UPT grads. [Air Force]
    Total N (# Grps):     32 (4)
    Simulator/Aircraft:   ASPT/T-37
    Tasks:                3 basic control maneuvers
    Training length:      4 trials
    Research statistics:  Reported t-tests (Table 9, p. 14)
    Comparison:           Motion vs. No-motion
    RPB (Mean N):         RPB(1)&(3) = .138 (30); RPB(2) = .112 (30)

16.  Reed & Reed (1979)/AFHRL
    Population:           Student pilots [Air Force]
    Total N (# Grps):     21 (3)
    Simulator/Aircraft:   Air refueling director lights trainer/F-4C &
                          KC-135
    Tasks:                10 tasks related to air refueling
    Training length:      1 hour
    Research statistics:  t-test calculated from raw IP ratings (Table 3,
                          p. 19)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .141 (21); RPB(2) = .072 (21);
                          RPB(3) = .211 (21); RPB(4) = .444 (21);
                          RPB(5) = -.304 (21)

17.  Martin & Cataneo (1980)/AFHRL
    Population:           UP trainees (13 were AF Academy grads.)
    Total N (# Grps):     24 (3)
    Simulator/Aircraft:   ASPT/T-37
    Tasks:                3 basic control maneuvers
    Training length:      3 trials, 1 hr per trial
    Research statistics:  Reported t-tests (Table 9, p. 18)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1)&(3) = .301 (23); RPB(4) = .265 (23);
                          RPB(5) = .329 (23)

18.  Pierce (1983)/AFHRL
    Population:           UPT grads. [Air Force]
    Total N (# Grps):     40 (2)
    Simulator/Aircraft:   ASPT/A-10
    Tasks:                5 basic control maneuvers
    Training length:      5 trials, total [illegible]
    Research statistics:  Reported t-tests (Tables C-3 to C-6, pp. 30-35)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = .135 (40); RPB(2) = .095 (40);
                          RPB(3) = .148 (40)

Note: RPB(1) = Overall; RPB(2) = Obj. meas. only; RPB(3) = Subj. meas. only;
RPB(4) = Initial transfer; RPB(5) = Final transfer.
Table 2 (continued)

19.  Westra et al. (1986)/NTSC
    Population:           Students entering FCLP phase of trng.
    Total N (# Grps):     126 (2)
    Simulator/Aircraft:   VTRS/T-2C
    Tasks:                Carrier qualification landings & field carrier
                          landing practice (FCLP)
    Training length:      Study variable: 20, 40, or 60 trials
    Research statistics:  Reported t-tests (Tables 10, 12, & 14, pp. 26, 40,
                          & 43)
    RPB (Mean N):         RPB(1) = .113 (126); RPB(2) = .162 (126);
                          RPB(3) = .014 (126); RPB(4) = .133 (126)

20. & 21.  Caro & Isley (1966)/HumRRO
    Population:           Warrant Officer in 4 week rotary wing course
                          [Army]
    Total N (# Grps):     132 (4)
    Simulator/Aircraft:   Whirlymite helicopter trainer/OH-23D
    Tasks:                Basic contact flight maneuvers
    Training length:      Study variable: either 3 or 7 hrs
    Research statistics:  t-tests calculated from IP ratings & total flight
                          time (Tables 2 & 5, pp. 42-43)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1),(3),&(4) = .033 (45)
    Comparison:           Limited vs. Full Trng.
    RPB (Mean N):         RPB(1),(3),&(4) = .001 (51.33)

22.  Holman (1979)/ARI
    Population:           Student pilots [Army]
    Total N (# Grps):     59 (2)
    Simulator/Aircraft:   Helicopter flight sim./CH-47
    Tasks:                32 control tasks, basic and advanced
    Training length:      Minimum 1 hr for basic tasks & 15 hrs for advanced
    Research statistics:  Reported t-tests & calculated t-tests from raw
                          scores
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1)&(3) = .076 (59.5); RPB(5) = .073 (61)

23. & 24.  Isley et al. (1968)/HumRRO
    Population:           Student pilots (instrument phase) [Army]
    Total N (# Grps):     145 (3)
    Simulator/Aircraft:   1-CA-1/TH-13T
    Tasks:                4 tactical instrument flight maneuvers
    Training length:      Study variable: either 10 or 20 hrs, total 8 weeks
    Research statistics:  t-tests calculated from reported IP ratings &
                          error rates (Tables 6, 7, & 9, pp. 13-15)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1) = -.028 (65.71); RPB(2) = -.099 (40.5);
                          RPB(3) = -.021 (69.92); RPB(4) = .016 (89.5)
    Comparison:           Limited vs. Full Trng.
    RPB (Mean N):         RPB(1) = -.027 (56.17); RPB(2) = -.061 (39);
                          RPB(3) = -.022 (59.6); RPB(4) = 0.0 (63)

25.  McDaniel et al. (1983)/[source not legible]
    Population:           UPT grads. [Navy]
    Total N (# Grps):     26 (2)
    Simulator/Aircraft:   2F64C (OFT) & 2C44 (CPT)/SH-3H (Sea King)
    Tasks:                9 flight control maneuvers
    Training length:      12 trials, 1 hr, 45 mins. per trial
    Research statistics:  Reported correlation coefficients (Tables 6 & 12,
                          pp. 26 & 32) & t-tests calculated from mean trials
                          to prof. (Tables 5 & 11, pp. 25 & 31)
    Comparison:           Sim. vs. A/C only Trng.
    RPB (Mean N):         RPB(1)&(3) = .205 (25.5)

26.  Caro et al. (1964)/NTEC
    Population:           CAT I UPT grads [Navy]
    Total N (# Grps):     22 (2)
    Simulator/Aircraft:   LCCPT & 2C44 (CFT)/SH-3H
    Tasks:                19 tasks common to both simulators
    Training length:      6 trials, 2.5 hrs per trial
    Research statistics:  Reported t-tests (Table C-1, p. 48)
    Comparison:           Low vs. High Fidelity Simulation Training
    RPB (Mean N):         RPB(1)&(3) = .314 (20)

Note: RPB(1) = Overall; RPB(2) = Obj. meas. only; RPB(3) = Subj. meas. only;
RPB(4) = Initial transfer; RPB(5) = Final transfer.
Table 3

Breakdown of Simulator Training Experiments Based on
Aircraft Type and Experiment Type

Breakdown Level                                          N      RPB

Combined group
  All Experiments                                       26      .13

Aircraft type
  (A) Jet Experiments                                   19      .15
  (B) Helicopter Experiments                             7      .04

Experiment type, jet
  (A1) Simulator vs. Aircraft-Only Training             10      .26
  (A2) Motion vs. No Motion                              5     -.05
  (A3) Other                                             4      .19

Experiment type, helicopter
  (B1) Simulator vs. Aircraft-Only Training              3      .02
  (B2) Motion vs. No Motion                              1      N/A
  (B3) Other                                             3      .04

Note. "N" refers to the number of experiments at each level of breakdown
variable. RPB is the weighted mean point biserial correlation
coefficient computed for a specific breakdown level. N/A indicates
insufficient number of experiments to compute a RPB.
across experiment type, there remained stark differences between jets and
helicopters (RPBs equal .15 and .04, respectively). In Table 3, (A) versus
(B) denotes this comparison. A more valid comparison, and the one used for
the subgroup analysis, involved contrasting results from similar experiments
for each type of aircraft. This contrast is even more pronounced (RPBs are .26
and .02 for jet and helicopter experiments; see Table 3, (A1) versus (B1)).
In addition, there was a reduction of the "true" variance for jet experiments
(.011) and helicopter experiments (0.00) compared to the variance of the total
group (.015).

The correlational and subgroup analyses for helicopters were not
conducted due to the paucity of usable experiments. The remainder of the
findings from the meta-analysis are reported below. These findings pertain to
jet experiments only.

Research Objectives

To test whether all experiments involving jets should be viewed together
or separately, a second analysis was performed, based on the stated research
goals. This analysis was conducted to assure that experiments that are
dissimilar in important ways were not combined to provide meaningless
results. The results of this analysis are presented in Appendix H. It was
found that experiments comparing simulator and aircraft-only training appear
to be substantially different from those that investigate the effectiveness of
simulator motion in training (note differences in RPBs for (A1) and (A2)
subgroups in Table 3). The four remaining experiments ("other" category)
could not logically be combined with either of the other two types of jet
experiments, and no further analyses could be performed with them. These
four experiments include: 1) two that compare full simulator training versus
limited training (Brictson & Burger, 1976; Pierce, 1983); 2) one that
compares training using an older and supposedly lower fidelity simulator with
a newer, higher fidelity simulator (Browning, Ryan, Scott & Smode, 1977); and
3) one that compares the combined use of a cockpit familiarization trainer
(CFT) and an operational flight trainer (OFT) with the OFT alone (Browning,
Ryan & Scott, 1973).

In summary, initial analyses of research data demonstrate that task
equipment does have an effect on training outcomes, and support the
separation of simulator training outcomes across different aircraft types.
They also support the separation of training results from experiments that
differ substantially in design characteristics.

Frequencies and Mean Values

Frequency data for experiments involving jets provide useful information
about the research domain (see Appendix D for frequencies of individual
response categories). All experiments in this analysis involving jets were
reported in technical reports, and most experiments (N = 10) compared simulator
versus aircraft-only trained groups. Five others compared subjects trained on
a simulator with the motion system turned on, with subjects trained with the
motion turned off. With the exception of two experiments, all subjects were
current UPT trainees or were recent UPT graduates.

As to simulator design features (part of throughput in the meta-model,
page 19), most experiments reviewed here were performed using whole task
trainers with a computer generated image (CGI) system and a motion system
having between one and six degrees of freedom (DOF). Field of view was
reported in some experiments, with little variation. G-seats were used
infrequently and use of G-suits was reported in only a few experiments. All
experiments used subjective measures, tied directly or indirectly to
instructor pilot ratings, although only one reported intra-rater reliability
estimates. Less than a third explicitly reported information that would allow
measurement of initial or final transfer performance.

SIMULATOR TRAINING: JETS

General Findings: Simulator vs. No Simulator

Table 4 presents RPBs, mean percent negative research statistic values,
and total number of experiments for the five result classification categories.
The RPB reported in the "overall effect size" category (equal to .26) is
identical to the "simulator versus aircraft-only training" category presented
in Table 3 (level A1). These data indicate that experiments using objective
measures reported smaller training outcomes than those using subjective
measures (Table 4, (2) versus (3)). The data also show that the RPB for the
initial transfer trial was noticeably greater than for the final transfer
trial ((4) versus (5)), although the data should be viewed with caution due to
the small number of experiments used to calculate these values. Finally, the
low mean percent negative values show that the majority of the research
statistics used to calculate individual RPBs were positive. This indicates a
consistent training effect across the performance measures used.

In accordance with suggested guidelines for reporting results of
meta-analytic reviews (Wolf, 1986, pp. 9-65), 95 percent confidence intervals
(CIs) were calculated for important RPB values. These values, along with
relevant statistical information, are presented in Appendix I. For Table 4,
only the final transfer category incorporated a value of zero within the
stated CI parameters, indicating that the effect may not be very strong.

Item and Response Category Reduction. The original code sheet contained
both continuous and discrete response categories for describing research
characteristics. These were reduced to four continuous and six
discrete response categories based on frequencies and correlational analysis.
These ten research characteristics were used during subsequent analysis. Item
and response category reduction is described below.

Response categories having zero frequency were eliminated. When
possible, response categories were combined to allow for meaningful
Table 4

Weighted Mean Point Biserial Correlation Coefficients, Mean Percent
Negative Research Statistics, and Number of Experiments by Result
Classification Category

Simulator versus Aircraft-Only Training (JETS)

                                      Dependent Variable
Result Classification                        Mean Percent Negative
Category                            RPB      Research Statistics       N

(1) Overall effect size             .26              8.1              10
(2) Objective measures only         .12              1.7               5
(3) Subjective measures only        .25             10.5              10
(4) Initial transfer trial          .19              0.0               3
(5) Final transfer trial            .03             25.0               2

Note. RPB refers to the weighted mean point biserial correlation
coefficient and N refers to the total number of experiments used when
calculating an individual RPB. Also, the RPB reported for classification
category (1) collapses across transfer trial and measurement type.

interpretation. For example, two response categories under "subject
assignment" (use of matching and the combined use of matching and random
assignment) were merged into a single category to directly assess the use of
matching prior to subject assignment (Table 5). All continuous items were
analyzed using correlational analyses only.

There were five result classification categories for RPB measures (see
Table 4, (1)-(5)). Correlational and subgroup analyses were performed
using only the "overall" RPB measure (category 1), since this measure was
calculated for all experiments. Furthermore, the percent negative research
statistic measure was used only for correlational analysis for two reasons: 1)
it is an indirect measure of training effectiveness; and 2) subgroup analysis
using this measure is inappropriate. As a ratio of statistical values within
a given experiment, it precludes the use of appropriate meta-analytic
procedures, such as attaching weights (number of subjects) to individual
experiment outcome values. Thus, calculation of a weighted mean value across
experiments is not possible.
Table 5

Correlations Between Research Characteristics and Dependent Variables

                                            Dependent Variables
                                                        Percent Negative
                                         RPB            Research Statistics
Research Characteristic               r    (n)    p        r    (n)    p

(1)  Use of matching prior to       .576  (10)  .041     .018  (10)  .480
     subject assignment
(2)  Use of CGI visual system      -.391   (7)  .193     .258   (7)  .238
(3)* Total FOV of visual system     .123   (9)  .376    -.055   (9)  .444
(4)* DOF of motion system           .677  (10)  .016    -.185  (10)  .305
(5)  Use of G-seat                  .202   (8)  .315    -.227   (8)  .294
(6)  Use of whole-task simulator    .593  (10)  .035     .131  (10)  .359
(7)  Use of proficiency-based       .639   (9)  .032    -.331   (9)  .192
     training
(8)  Having both objective and     -.772  (10)  .004     .094  (10)  .398
     subjective dependent
     measures
(9)* Number training hours          .702   (9)  .018    -.243   (9)  .265
(10)* Number training trials        .420   (5)  .241     .347   (5)  .283

* indicates variables are continuous.

NOTE. RPB is weighted mean point biserial correlation coefficient. Number in
parentheses is the number of experiments used to calculate Pearson
correlation coefficient. BOLD print indicates correlation coefficient
has p < .05.


Correlations. Correlations were computed among individual research
characteristics and between research characteristics and the two dependent
variables (RPBs and percent negative research statistics). Table 5 presents
correlations between the ten research characteristics and the two primary
dependent measures.

The correlation between RPB and percent negative research statistic
values is negative (r = -.47, p = .087). This negative relationship was
expected, since RPBs for individual experiments would tend to be higher when
fewer negative research statistics were included in the effect size
calculation. It follows that the correlations between individual research
characteristics and the two dependent measures would take on opposite signs.
This pattern was not observed for three of the ten research characteristics
(see Table 5, numbers (1), (6), and (10)) and is important as a criterion
for determining valid moderator variables.
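
The opposite-sign pattern can be checked directly from the Table 5 correlations. The sketch below is only an illustration of that check; the dictionary layout and function name are hypothetical, and the values are the tabled correlations as read from the report.

```python
# Correlations from Table 5: (with RPB, with percent negative research statistics).
TABLE_5 = {
    1: (0.576, 0.018),
    2: (-0.391, 0.258),
    3: (0.123, -0.055),
    4: (0.677, -0.185),
    5: (0.202, -0.227),
    6: (0.593, 0.131),
    7: (0.639, -0.331),
    8: (-0.772, 0.094),
    9: (0.702, -0.243),
    10: (0.420, 0.347),
}

def has_opposite_signs(r_rpb, r_pct_negative):
    """True when the two correlations differ in sign."""
    return r_rpb * r_pct_negative < 0

# Characteristics whose two correlations share a sign fail the pattern.
failing = [item for item, (r_rpb, r_neg) in TABLE_5.items()
           if not has_opposite_signs(r_rpb, r_neg)]
print(failing)  # → [1, 6, 10]
```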

Intercorrelations  among  research  characteristics  from  experiments 
included  in  the  meta-analysis  are  presented  in  Appendix  K.  The  nature  and 
pattern  of  these  may  influence  correlations  between  research  characteristics 
and RPB measures reported in Table 5.

Examination  of  intercorrelations  among  the  research  characteristics  is 
useful  for  understanding  the  relationship  between  these  characteristics  and 
measures  of  training  outcomes  reported  here.  In  particular,  use  of  objective 
and  subjective  measures  when  calculating  training  outcomes  appears  to  be  an 
important  factor  mediating  the  observed  relationship  between  RPB  values  and 
several  research  characteristics.  In  addition,  research  variables  that  are 
indicative  of  the  strength  of  training,  such  as  number  of  training 
hours, the number of training trials, and use of proficiency-based criteria
for learner advancement, were found to be positively related to RPB values.
The relationship between research characteristics and training outcomes is
explored further in the next section using subgroup analyses.

The six discrete research characteristics were included in the subgroup
analyses involving jet experiments. The purpose of these analyses was
twofold. First, they provided statistically rigorous tests for determining
which research characteristics were moderator variables; second, they provided
valuable information about the magnitude of the relationship between research
characteristics and training outcomes.

Four of the six discrete research characteristics had subgroupings formed
by partitioning experiments into those incorporating the characteristic ("yes"
grouping) and those not incorporating it ("no" grouping). The procedure for
conducting subgroup analysis with these variables was identical to that
presented in the overview section described earlier (see Hunter et al., 1982),
with one exception: a valid moderator should exhibit a reduction in
the "true" variance for the "yes" group, relative to that of the whole group.
The "no" group may or may not exhibit a reduced variance, since experiments
falling into this category could be heterogeneous in nature. This particular
procedural variation was used by Dickinson et al. (1986, p. 38) in their
meta-analytic review of work performance ratings. The two research
characteristics not involving a yes/no dichotomy required a reduction of
variance for each of the response categories. They are the use of blocked or
proficiency-based training, and the use of a whole or part-task simulator. To
summarize, the criteria for determining if a research characteristic was a
valid moderator were: 1) the variable produced correlations with the two
dependent measures having opposite signs (see Table 5); and 2) the subgroup
analysis produced a reduction in the "true" variance for individual subgroups
relative to that of the whole group.
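
The second criterion can be sketched as follows. This is a minimal illustration that assumes the uncorrected ("bare-bones") formulas commonly associated with Hunter et al. (1982): a sample-size-weighted mean effect size, a weighted observed variance, and a sampling error variance of (1 - mean r squared) squared times K / N. The report's own computations may include steps not shown here, and the experiment data below are hypothetical.

```python
def true_variance(experiments):
    """Return (weighted mean r, estimated "true" variance).

    experiments: list of (effect size r, sample size N) pairs.
    """
    total_n = sum(n for _, n in experiments)
    k = len(experiments)
    mean_r = sum(r * n for r, n in experiments) / total_n
    observed = sum(n * (r - mean_r) ** 2 for r, n in experiments) / total_n
    sampling_error = (1 - mean_r ** 2) ** 2 * k / total_n
    # Negative estimates are truncated at zero.
    return mean_r, max(observed - sampling_error, 0.0)

def meets_variance_criterion(whole_group, yes_group):
    """Criterion 2: the "yes" group shows less "true" variance than the whole group."""
    _, whole_var = true_variance(whole_group)
    _, yes_var = true_variance(yes_group)
    return yes_var < whole_var

# Hypothetical experiments: (RPB, number of subjects).
yes_group = [(0.50, 40), (0.55, 30), (0.60, 50)]
no_group = [(0.05, 40), (0.45, 20), (-0.20, 60)]
whole_group = yes_group + no_group

print(true_variance(whole_group))
print(true_variance(yes_group))
print(meets_variance_criterion(whole_group, yes_group))
```

In this made-up example the "yes" experiments agree closely with one another, so nearly all of their observed variance is attributable to sampling error, while the whole group retains substantial unexplained variance.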

Results of the subgroup analysis are presented in Appendix J. Two
discrete  research  characteristics  failed  to  meet  the  first  criterion  stated 
above:  namely,  use  of  matching  prior  to  group  assignment  and  use  of  a  whole  or 
part-task  simulator.  In  addition,  only  one  of  the  ten  experiments  reported 
using  a  G-seat,  thus  precluding  subgroup  analysis  for  this  variable. 

Input  Variables 

Task Requirements. Since it has been suggested that simulator training
works better for some tasks than others (Orlansky & String, 1977; Semple,
Hennessy, Sanders, Cross, Beith & McCauley, 1981), a grouping of experiments
by task type was made. Appendix L presents results of the subgroup analysis
based on three tasks: takeoffs, approaches, and landings. These tasks were
chosen because relevant statistical information about each task was presented
in three separate reports. Analysis of other task types was not performed
because requisite task-specific information was not included in most reports.

The results of the subgroup analysis for tasks indicated a substantial
improvement of training outcome (RPB) measures for these three tasks relative
to that of average simulator training outcomes (RPBs equal .65, .64, and .57
for takeoffs, approaches, and landings, respectively). It is important to
note that these results are based on information from three experiments and
that only the approach (to landing) task realized variance reduction relative
to the whole group. In addition, there was considerable variance left
unaccounted for (greater than 40 percent) for both takeoff and landing tasks,
thus leaving open the question of potential moderator variables for these
tasks.

Trainee Characteristics. Differences between trainees were rarely
studied. Flight experience received mention in some experiments as a possible
variable, but there was insufficient information on any trainee
characteristics to perform an analysis.


Simulator Design. Subgroup analysis indicated that use of a CGI visual
system was not a valid moderator variable, since separating experiments
according to this feature did not produce a reduction in the true variance in
accordance with the prescribed criterion. Both correlational and subgroup
analysis indicated that use of a CGI visual system may produce below average
training outcomes (RPBs are .20 and .26 for jet experiments using a CGI visual
system and for jet experiments overall, respectively). The effects of using a
G-seat or G-suit were not conclusive, since only one experiment reported using
a G-seat, and use of G-suits was not addressed by most experiments. Training
differences based on the use of a whole or part-task simulator could not be
determined, since only two experiments used a part-task simulator for
training.


Table 6 presents RPB values and mean percent negative research statistics
for motion experiments. Differences can be found by comparing these results
with experiments investigating simulator training per se, described in the
previous section (see Table 4). First, as noted before, the RPB value for
motion-based experiments (-.05) was substantially different from that reported
for simulator versus aircraft training experiments (.26). The negative RPB
value implies that motion-based training may be detrimental to training
outcomes, compared to fixed-base simulator training. This result was also
reflected in percent negative research statistic values (.08 versus .44 for
simulator training and motion experiments, respectively).


In summary, these data support previous research indicating that use of
motion simulation for jets does not consistently produce greater training
outcomes relative to simulator training without motion (Martin, 1981; Orlansky
& String, 1977). Over 40 percent of research statistics comparing simulator
training with and without motion favor training without motion (see Table 6).


Training Context. The type of training, proficiency or blocked, was
demonstrated to have a moderating effect on the training results. Experiments
that incorporated a proficiency criterion for advancing trainees produced
consistent, sizable improvements in training outcomes compared to those
incorporating blocked training, and compared to overall jet training outcomes
(RPBs are .54, .21, and .26, respectively). This result was supported by
correlational analysis.
Table 6

Weighted Mean Point Biserial Correlation Coefficients, Mean Percent
Negative Research Statistics, and Number of Experiments by Result
Classification Category

Motion versus No Motion Experiments (JETS)

                                          Dependent Variable

Level of Research Characteristic/                 Mean Percent Negative
Result Classification Category           RPB      Research Statistics      N

(1)  Overall effect size                 -.05           44.0               5
(2)  Objective measures only              .11           32.0               3
(3)  Subjective measures only            -.04           34.6               5
(4)  Initial transfer trial               .12           16.5               2
(5)  Final transfer trial                 N/A           N/A                -

Note. RPB refers to weighted mean point biserial correlation coefficient
and N refers to the total number of experiments used when calculating an
individual RPB. Also, the RPB reported for classification category (1)
collapses across transfer trial and measurement type.
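
For readers unfamiliar with the RPB metric, the sketch below shows one conventional way to compute a point biserial correlation for a single experiment and then a sample-size-weighted mean across experiments. It is an illustration under stated assumptions, not the report's code: scores are assumed to be scaled so that higher is better, a positive value is taken to mean the first group outperformed the second, and all data and function names are hypothetical.

```python
import math
import statistics

def point_biserial(group_one_scores, group_zero_scores):
    """Point biserial correlation between group membership and score."""
    n1, n0 = len(group_one_scores), len(group_zero_scores)
    n = n1 + n0
    # Standard deviation of all scores combined (population form).
    sd_all = statistics.pstdev(list(group_one_scores) + list(group_zero_scores))
    diff = statistics.mean(group_one_scores) - statistics.mean(group_zero_scores)
    return diff / sd_all * math.sqrt(n1 * n0 / n ** 2)

def weighted_mean_rpb(results):
    """results: list of (point biserial r, sample size) pairs."""
    total_n = sum(n for _, n in results)
    return sum(r * n for r, n in results) / total_n

# Hypothetical transfer scores for one experiment.
motion_group = [8, 9, 7, 9]
no_motion_group = [6, 7, 5, 8]
r_single = point_biserial(motion_group, no_motion_group)
print(round(r_single, 3))

# Hypothetical set of experiments: (r, number of subjects).
print(weighted_mean_rpb([(0.20, 10), (0.40, 30)]))
```

Weighting by sample size lets larger experiments contribute more to the cumulative value, so a single small experiment with an extreme result has limited influence on the overall RPB.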


Experiments  incorporating  both  objective  and  subjective  evaluation 
reported  training  outcomes  of  lower  magnitude  than  those  using  only  subjective 
measures.  This  relationship  appeared  to  influence  the  observed  correlation 
between  RPB  values  and  several  other  research  characteristics. 

It was also expected that experiments having greater numbers of training
hours would produce higher training outcomes. The correlation for training
hours with RPB measures (r = .70) was positive and significant. It should be
noted that total training hours and total training trials were calculated by
collapsing across all task boundaries within a given experiment. A more
meaningful measure would have been to calculate the average number of training
hours or trials per task, but information needed to calculate per-task
averages was not given in most experiments.
DISCUSSION AND CONCLUSIONS

OVERVIEW 

Review of the aviation training effectiveness research clearly shows that
relatively few of the potential moderating variables have been incorporated
into flight simulation experiments. Some of these variables, especially those
involving experiment quality, have a significant influence on how experimental
results are interpreted and may affect the magnitude of training outcomes
(see Appendix C). Of those variables that have been incorporated into this
research, this meta-analysis found several that have a clear moderating
effect on training transfer.

Sizable differences in the effectiveness of simulation training were
found between jets and helicopters. This section focuses on each of the topic
areas of the meta-model (Figure 1) used as the framework for developing the
code sheet. Important issues within each area are addressed, and results from
previous reviews are presented. This discussion also provides a research
agenda which can be used both to guide future research efforts in the flight
simulation training area, and to suggest ways of documenting efforts to
maximize their value for future meta-analytic reviews.

INPUT  VARIABLES 

Task Equipment

Not all flight simulators and training systems incorporating these
simulators are the same. That this dissimilarity extends to the training
effectiveness of the simulators is supported by the results presented here. In
particular, there are dramatic differences in both the magnitude and pattern
of training outcomes for jet and helicopter simulator training systems. For
jets, simulator training outcomes have been consistent and positive. That is,
comparisons between pilots trained in the aircraft only, and those trained on
a simulator and the aircraft, consistently favored the latter group (RPB =
.26). This pattern is true across a variety of task boundaries. For
helicopters, the accumulated difference between simulator and aircraft-only
trained pilots was quite small (RPB = .02). Over 40 percent of the
experimental comparisons favored the aircraft-only trained group (compared to
eight percent for jets). Experiments directly assessing use of simulator
motion indicated motion cuing did not improve training for jets (see Table 3,
A2). Results from the sole experiment included in this review assessing the
effects of motion cuing for helicopters (McDaniel et al., 1983) indicated that
certain helicopter tasks may benefit from motion cues.

The observed differences in experimental results for these two aircraft
types necessitated separate analysis and reporting of results of jet and
helicopter experiments. The limited number of helicopter experiments that
could be included in this review precluded any in-depth analysis aimed at
identifying moderator variables in this area. For these reasons, the
following discussion is confined to training involving jet aircraft, except
where otherwise stated. The subject population targeted for this review was
novice jet pilots; specifically, recent Undergraduate Pilot Training (UPT)
graduates or current trainees. Thus, results of this meta-analysis are not
generalizable to transition pilots with prior jet experience or pilots having
extensive prior simulator experience.

Task Requirements

Previous reviews (Orlansky & String, 1977; Semple et al., 1981)
suggested that certain "basic" aircraft control tasks (e.g., approach and
landing) appeared to transfer much better than more complex tasks (e.g.,
formation flight maneuvers). Since no existing task taxonomy was appropriate
for the tasks in this domain, an attempt was made to develop task categories
based on difficulty ratings assigned to tasks by novice and experienced
pilots. Appendix M presents the rating form used to collect task difficulty
information for jet aircraft. This effort was only partially successful.
There are several factors that must be considered before task difficulty
information can be a useful metric. For example, task difficulty is relative,
so tasks trained early in the training program (e.g., descending turns) may
seem difficult until one attempts more complex tasks during a later training
phase (e.g., carrier landings). Thus, only a few tasks were rated in a
consistent manner along a continuum ranging from "low" to "high" difficulty.

Individual research reports were examined for inclusion of the same task
or set of tasks, with training outcome data reported for individual tasks, to
allow for calculation of a cumulative RPB value for each task. Three tasks
were found that met these criteria: normal takeoffs, approaches, and landings
(excluding carrier landings). Cumulative RPB values were over two times
greater for these tasks (RPBs = .65, .64, and .57, respectively) relative to
the overall value for jet training (RPB = .26).
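
The "over two times greater" comparison follows directly from the reported values, as this short check shows. The variable names are hypothetical; the numbers are those given in the text.

```python
overall_jet_rpb = 0.26
task_rpbs = {"takeoffs": 0.65, "approaches": 0.64, "landings": 0.57}

# Ratio of each task-specific RPB to the overall jet training RPB.
ratios = {task: rpb / overall_jet_rpb for task, rpb in task_rpbs.items()}
for task, ratio in ratios.items():
    print(f"{task}: {ratio:.2f} times the overall jet value")
```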

These results underscore the need for reporting task-specific training
outcome information in future research efforts. Without the inclusion of such
data, as well as detailed information about simulator fidelity and
configuration parameters, future meta-analytic reviews will not be able to
quantify performance outcome tradeoffs for varying fidelity levels of
specified simulator subsystems.

Trainee Characteristics

Student pilots bring with them into the learning environment different
aptitudes, abilities, and prior experiences. These factors can influence the
amount and rate of knowledge acquisition, retention, and transfer of training.
Taken together, these factors comprise what is commonly referred to as
individual differences.

In their review of individual differences within military training
environments, Hogan, Arneson, and Salas (1987) cite evidence suggesting that
individual difference factors may account for a portion of the variance
associated with simulator training outcomes. For example, Federico (1982)
presented findings indicating that even when training programs incorporate
mastery-level criteria for advancing or terminating training, differences
between subjects' performance are still noticeable. Flammer (1976) reported
that mastery training did not reduce individual differences in learning time
within a given mastery unit (see also Arlin, 1984).

Motivation is an individual factor thought to have considerable influence
for both initial skill acquisition and for subsequent transfer to the
operational environment (AGARD Report, 1980). The authors of the 1980 AGARD
report considered understanding and resolving motivational issues to be the
key to maximizing training outcomes. Since motivation plays an important role
in current theories of learning (e.g., Bandura & Walters, 1963; Gagné, 1985;
Skinner, 1953) and instructional development (e.g., Dick & Carey, 1978), it
may be considered to influence all phases of simulator-based training, from
device design to evaluation of training performance. Motivational issues
include both the acceptance of the simulator as a valid training device, and
the design of training that involves and challenges the student.

There is evidence that students may lose motivation after prolonged
simulator training simply because they would rather begin training in the
aircraft (Pohlmann & Reed, 1978, p. 8). Despite the possible influence of
motivation, the flight training research included very little pertinent
information about it. Formal assessment of instructor and student acceptance
of a given simulator was rarely attempted (see, e.g., Reed & Reed, 1979),
although anecdotal information was given in a few reports.

One other attempt was made to investigate the effects of motivation
within training. Performance feedback, in the form of knowledge of results
(KOR), has been shown to have motivational properties (Kulhavey, 1977;
Kulhavey, White, Topp, Chan, & Adams, 1985). It was thought that differences
in how KOR was given (instructor generated, device generated, or a
combination of both) would influence performance outcomes. Unfortunately,
only three reports clearly specified how performance feedback was given (Gray
& Fuller, 1977; Payne et al., 1976; Westra, Lintern & Wightman, 1986). This
number was insufficient for meaningful analysis.

In general, researchers within the flight simulation training area have
treated individual differences between learners as a potential source of
error variance. Seven of the ten most common types of jet experiments
included in this review used a matching procedure to equate subjects prior to
group assignment. The matching variables used most frequently were overall
UPT scores and previous number of flight hours.

Use of matching prior to subject assignment correlated positively with
both RPB and percent negative research statistic values (r's = .58 and .02,
respectively). However, since the pattern of results of the correlational
analysis for this variable did not conform to that prescribed for selecting
variables for additional (subgroup) analysis, it was eliminated as a potential
moderator variable, and no further analysis was done.
THROUGHPUT VARIABLES

Simulator Design

One of the goals of this analysis was to identify simulator fidelity
configuration parameters that optimize training outcomes. To accomplish this
goal, simulators were viewed in terms of individual subsystems, and when
possible, an attempt was made to evaluate separate components or design
features within a given subsystem. For example, the motion and force cuing
subsystem was separated into use of G-seat and DOF of the platform motion
apparatus. Similarly, evaluation of the visual image generation subsystem
involved separate analysis of FOV parameters as well as use of CGI
technology. At a more global level, analysis was performed based on whether
the simulator was considered a whole or part-task device. Lack of
variability in the reported use of other design features precluded any
attempts at analysis. These included use of G-suit force cuing, sound
simulation, "stick shaker" system, instructional and environmental features,
as well as the type of procedure used to validate the flight control
characteristics.

The gathering of information for analysis on fidelity of simulation was
hampered by two factors: the deficiency of detailed reporting of simulator
configuration parameters, and the lack of a validated taxonomy of flight
tasks. The latter compelled the assessment of fidelity information on a
task-by-task basis. Although several task taxonomies have been developed (see,
e.g., Wheaton et al., 1976; Fleishman & Quaintance, 1984), none were found to
be appropriate for use in aviation tasks.

The findings reported here are only an initial step toward fulfilling the
goal of extracting empirically based design guidance principles, because
detailed information on the training device used is not routinely reported in
research reports. As a result, the level of analysis possible from available
information may be too global to be of immediate use by engineers and other
simulator design specialists.

Visual  Simulation.  The  only  two  variables  used  to  evaluate  visual 
imaging  systems  were  total  FOV  of  the  system  and  use  of  CGI  technology.  A 
more  thorough  evaluation  such  as  determination  of  configuration  requirements 
would  be  more  helpful  but  is  not  possible  from  the  information  available. 
Neither  total  FOV  nor  the  use  of  CGI  technology  was  found  to  have  a 
moderating  effect  on  training  outcomes.  This  is  consistent  with  findings 
reported  elsewhere  (Semple  et  al.,  1981;  Woodruff,  Smith,  Fuller,  &  Weyer, 
1976).

This result should not be taken to indicate that visual imaging is an
unimportant factor in simulator training. Recent surveys of engineers and
other training specialists, to determine human perception and performance
information needed when making design decisions, found visual and motion
simulation areas were the two most frequently stated areas of need (Klein &
Brezovic, 1987; Rouse, 1983). Additionally, the AGARD (1980) report
concluded that "... With few exceptions, the overwhelming finding has been
that visual tasks learned in the simulator show positive transfer to the
aircraft" (p. 9). Finally, visual imaging technology is far superior today to
that used to produce the visual systems of simulators used in this review.
For these reasons, continued evaluation of current training systems that
incorporate this technology is warranted.

Motion and Force Simulation. Considerable interest and attention have
been placed on the utility of simulator motion cuing for facilitating skill
acquisition and transfer. In general, results of this meta-analysis support
the previous reviews which indicate motion cuing adds little to the training
environment (Martin, 1981; Hays & Singer, 1989; Orlansky & String, 1977).
The cumulative effect size value across the five motion versus no-motion
experiments included in the meta-analysis was negative in value (RPB = -.05),
indicating that motion may detract from training, at least for some tasks.

These results are inconsistent with findings from a recent review of the
flight simulation evaluation literature by Pfeiffer and Horey (1987). There
are several obvious differences between the Pfeiffer and Horey (1987) review
and this review that help to explain these contradictory findings. First,
Pfeiffer and Horey used as their research effect size metric a transfer
effectiveness ratio (see also Hays & Singer, 1989, pp. 133-134). This measure
is highly dependent on the length of training on the simulator. Second,
although their approach was described as "meta-analysis" (p. 15), they did not
incorporate commonly accepted meta-analytic methodology, such as weighting
individual research outcomes by their corresponding sample size, or providing
a detailed explanation describing their rationale for decisions/procedures
used. Finally, the comparative analysis upon which they concluded the
superiority of simulator training with motion cuing involved use of a t-test
(p. 39). This statistical procedure is not appropriate where the underlying
means and standard deviations were derived by collapsing across experiments
(not subjects), and thereby calls into question the nature of the distribution
upon which the critical value of the statistic is based.
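
The dependence of a transfer effectiveness ratio on simulator training time can be illustrated with a small sketch. It assumes the conventional definition of the ratio (aircraft time saved by the simulator-trained group divided by time spent in the simulator), which this report does not spell out, and the hours below are hypothetical.

```python
def transfer_effectiveness_ratio(control_aircraft_hours,
                                 trained_aircraft_hours,
                                 simulator_hours):
    """Aircraft hours saved per simulator hour (conventional definition, assumed)."""
    if simulator_hours <= 0:
        raise ValueError("simulator_hours must be positive")
    return (control_aircraft_hours - trained_aircraft_hours) / simulator_hours

# Hypothetical example: the aircraft-only group needs 10 hours to reach criterion.
control_hours = 10.0
# (simulator hours, aircraft hours still needed after simulator training)
conditions = [(2.0, 7.0), (4.0, 5.5), (8.0, 4.5)]

ratios = [transfer_effectiveness_ratio(control_hours, aircraft, simulator)
          for simulator, aircraft in conditions]
for (simulator, _), ratio in zip(conditions, ratios):
    print(f"{simulator:.0f} simulator hours: ratio = {ratio:.2f}")
```

In this made-up case the same device yields ratios from about 1.5 down to about 0.7 purely because of how long it was used, which is why experiments with different simulator training lengths are hard to combine on this metric.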

Evidence indicating that motion cuing adds little, or nothing, to the jet
simulator training environment cannot be considered definitive. There are two
important reasons for questioning these results. First, calibration of
critical motion cuing system parameters (e.g., control input response times,
leg extension acceleration rates) was rarely attempted. Only one
motion-related experiment included in this review reported results of
calibration tests prior to experimentation (McDaniel et al., 1983). Since a
similar assessment was not done during or after the experiment, the
possibility of software or hardware failure during the course of the research
offers a cogent explanation for training outcomes on certain tasks favoring
the no-motion trained group. Incorporating appropriate methodological
procedures, such as periodic calibration checks, is crucial for producing
unequivocal results in this area.

A second reason for questioning results of motion versus no-motion
experiments is the inclusion of several training tasks in each
experiment. It has been argued that motion effects vary from task to task
depending on the primacy of motion cues for performing critical aspects of the
task. Since reports often collapse across task boundaries when making
between-group comparisons, possible specific effects from motion cuing may be
inadvertently masked. Generally, reports do not distinguish between the kinds
of motion provided.

Another issue when considering motion was addressed by Gundry (1976), who
distinguished between maneuver motion and disturbance motion, the former
resulting from aircraft control inputs of the pilot and the latter from
environmental conditions, such as weather turbulence or mechanical
malfunction. Gundry reasoned that, whereas providing motion cues related to
disturbance would benefit simulator training, incorporating maneuver motion
cues would not (see also Martin, 1981; DeBerg, McFarland, & Showalter, 1976).
Information on the type of motion used in experiments was not reported, so the
difference could not be evaluated.

In contrast to motion experiments involving jets, a similar experiment
using helicopters (McDaniel et al., 1983) produced a positive overall training
outcome (RPB = .21). This result must be tempered by the fact that information
from a single experiment was used to derive the cumulated RPB metric and
methodological problems cited above apply to this experiment. In this regard,
a close examination of experimental outcomes from the McDaniel et al. (1983)
experiment indicates noticeable differences in the direction of training
outcomes for certain tasks. Specifically, positive training outcomes (i.e.,
instances where the motion group outperformed the no-motion group) were
realized on three tasks: Aircraft Stabilization Equipment (ASE) off,
free-stream recovery, and coupled hover (RPBs = .19, .37, and .^5,
respectively). For all other tasks, including takeoffs, approaches, and
landings, motion cuing was associated with negative training outcomes. This
pattern of results indicates that motion cuing may aid only certain training
tasks. Results from additional experiments of this kind must be added to these
before conclusions concerning task-specific motion effects can be made with
any degree of confidence.

Training Context

Some previous reviews of the flight simulation area have stressed that a
systems approach be taken when evaluating simulation training (AGARD Report,
1980; Hays & Singer, 1989; Rose, Wheaton & Yates, 1985; Semple et al., 1981;
Wheaton et al., 1976). According to the systems approach, the simulator is
never a stand-alone item being evaluated, but must be considered along with
other relevant features of the training milieu, such as curricula, scheduling,
staffing, and the use of specific training procedures. The AGARD (1980)
report concluded that, "...how the device is used can influence its
effectiveness to an equal or greater extent" relative to that expected by
appropriately matching the simulator to task parameters (p. 9).
Unfortunately, the accumulated research yielded little data on this issue.
This meta-analysis provided information on only two areas within the training
context: training type and performance measurement.
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Training  Tvpa .  U««  of  training  procadurei  that  accommodated  individual 
learner  needs,  auch  ai  thoie  aaaoolatad  with  proflcienoy-baaed  training,  were 
found  to  be  more  effective  than  prooedurei  which  allocated  a  fixed  amount  of 
training  (RPBa  were  ,54  and  .21,  reapeotively) ,  The  latter  training  is 
commonly  referred  to  as  blocked,  look*atap,  or  fixed  time/triale  training,  The 
reaulta  of  both  correlational  and  subgroup  analysis  reported  here  clearly 
indicate  that  proficiency  training  raaulted  in  greater  training  transfer  to 
the  operational  environment  than  transfer  from  blocked  training. 
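
To give a rough sense of scale for the two values above, the following sketch converts each one to a standardized mean difference. It assumes that RPB denotes a point-biserial correlation and that the two groups being compared were of equal size; the function name r_to_d is illustrative and does not come from the report.

```python
import math


def r_to_d(r_pb: float) -> float:
    """Convert a point-biserial r to Cohen's d, assuming equal group sizes."""
    return 2.0 * r_pb / math.sqrt(1.0 - r_pb ** 2)


# The two correlations reported in the text for the training-type comparison.
for label, r in (("proficiency-based", 0.54), ("blocked", 0.21)):
    print(f"{label}: r_pb = {r:.2f}, d = {r_to_d(r):.2f}")
```

Under these assumptions the conversion is nonlinear, so the gap between the two training types looks larger on the d scale than on the correlation scale.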

Performance Measurement. The majority of research has used instructor
ratings to measure transfer performance. Despite wide use of these ratings,
problems associated with their use were mentioned in several reports: no
inter-rater reliability estimates were reported, and intra-rater reliability
estimates were reported in only one experiment (Westra, Lintern, Sheppard,
Thomley, Mauk, Wightman, & Chambers, 1986, p. 4/). The widespread omission
of inter- and intra-rater reliability information is problematic,
since true differences in performance cannot be inferred unless the measures
used to rate the performance are reliable (Cook and Campbell, 1979).
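
The reliability concern can be made concrete with a small sketch. It estimates inter-rater reliability as the Pearson correlation between two instructors' ratings of the same flights, then applies the classical correction for attenuation to an observed effect. The ratings, the observed effect of 0.30, and the function names are hypothetical; none of them come from the experiments reviewed here.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def disattenuate(r_observed, reliability):
    """Correct an observed correlation for unreliability in the criterion."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return r_observed / reliability ** 0.5


# Hypothetical ratings of the same six check rides by two instructors.
rater_a = [3, 4, 2, 5, 4, 3]
rater_b = [3, 5, 2, 4, 4, 2]

r_xx = pearson_r(rater_a, rater_b)
print(f"inter-rater r = {r_xx:.2f}")
print(f"observed effect 0.30 corrected to {disattenuate(0.30, r_xx):.2f}")
```

Without a reported reliability estimate, this kind of correction cannot be applied, which is one practical cost of the omission described above.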

Training effects estimated from objective measures of pilot performance
were consistently smaller than those estimated from subjective measures. This
finding is opposite to that reported by Semple et al. (1981, pp. 31-32). This
discrepancy is due to differences in how the two reports discriminated between
subjective and objective measures, and to the inclusion of different research
reports in the two analyses. Trials-to-proficiency measures, when proficiency
is based on instructor ratings, are considered subjective measures in this
report. In their review, Semple et al. (1981) identified
trials-to-proficiency data reported by Browning, Ryan, and Scott (1978) as
objective measures (see Browning et al., 1981, Table 4, p. 17). There is also
little overlap in the experiments included in the two reviews. Of the five
experiments included in the Semple et al. (1981) review, three were not
incorporated into this meta-analytic review (see Appendix A).

AGENDA FOR FUTURE RESEARCH IN THE FLIGHT SIMULATOR TRAINING AREA
OVERVIEW

A quantitative literature review should provide a summary of the
empirical findings and identify the knowledge gaps on which future research
efforts should focus. It is apparent from this meta-analytic review that many
important flight simulator training factors have yet to be addressed in a
systematic fashion. The research agenda that follows is separated into the
topic areas used as a framework for this report: input variables and
throughput variables.

INPUT  VARIABLES 

This meta-analysis has shown that there are major differences between jet
and helicopter training effects. Additional research is needed to further
specify these differences and determine the training methods that will provide
maximum effectiveness for each type of simulator.

Task Requirements

Subgroup analysis reported here indicates that simulator training does
not provide equal benefit to all aviation tasks. Detailed descriptions of
skills needed to perform tasks within a given training program are available
(see, e.g., Payne et al., 1976), yet these descriptions have not led to a valid
taxonomy for grouping aviation tasks.

As noted previously, research programs in this area would benefit greatly
if task categories (taxonomies) could be developed and validated. This would
allow generalization of results from single tasks to task groupings. The
search for a valid taxonomy for aviation tasks is critical for avoiding costly
duplication of future research efforts. There appear to be several reasons
why aviation tasks are not readily classified into well-defined groups. One
reason is that activities and skills needed to correctly perform the various
tasks differ considerably. In many instances, psychomotor performance is
required in addition to cognitive decision-making skills. Another reason is
that as the student pilot acquires more flight hours and masters successive
tasks, his reliance on, and use of, various informational sources may shift.

Fleishman and Quaintance (1984) discuss a number of "...descriptive
schemes using behavior requirements as a basis for the classification of human
task performance" (p. 127). While several of these schemes appear suitable
for classifying aviation tasks, much work needs to be done before they can be
applied to aviation tasks with any degree of confidence. Validation of any
task classification system depends on availability of detailed performance
outcome information for individual tasks. For this reason, future simulator
training research should provide detailed training outcome information for
individual tasks. This information will allow researchers to apply
appropriate statistical procedures (e.g., multivariate analysis) in order to
empirically validate task clusters.

The process required to validate an aviation task taxonomy may take an
extended period. Research in other areas may offer short-term payoffs in
terms of empirically derived training guidance. One area needing
investigation involves determining the simulator instructional features that
will improve training outcomes for specific aviation tasks.

Within the flight simulation training domain, trainee characteristics
make up what are considered individual differences and are usually viewed as a
source of measurement error. Most experiments used in this review
equated subjects via matching variables (UPT scores or number of years
flying) prior to assigning them to either the experimental or control group.

In their review of individual differences in military training
environments, Hogan et al. (1987) cite evidence indicating that individual
learners bring with them into the training environment cognitive and
non-cognitive factors that influence training outcomes. Training programs may
be customized to match individual learners in such areas as learning styles,
cognitive strategies, and sensory modalities (Goodman, 1976). The underlying
assumption for designing customized training programs is that individual
learners vary in their approach to understanding and remembering new
information. Since, for a given individual, preferred methods of learning are
thought to be linked to the learner's interests, abilities, aptitudes, and
motivations, training programs may facilitate or inhibit the learning process.
Thus, one area that needs attention is the development of useful methods for
determining a learner's cognitive and non-cognitive capabilities. Hogan et
al. (1986) review several measurement batteries useful for determining a
person's learning style or cognitive abilities (see also Su, 1984).
Non-cognitive factors, such as personality, affective adjustment, or physical
ability, can also be used to predict training success and may be used in
addition to cognitive ability measures to enhance their predictive properties
(Hogan et al., 1986).

Individuals differ in the level of motivation they bring to training and
also in how well the training program can motivate them. One necessary area
of research is how to promote acceptance of the simulator for training.

A second area of motivation investigation involves performance feedback
in the form of knowledge of results (KOR). KOR may be generated by several
sources within the simulator training environment, including the device
(e.g., hardcopy printout of flight maneuver elements) and the instructor
(e.g., verbal debrief). Future research projects should investigate the
relative effects on training outcomes for each of these sources of KOR or the
combined use of these sources. In addition, timing and amount of information
inherent in the KOR have been found to influence performance in the
psychological and educational training literature (see Kulhavey, 1977;
Anderson, Kulhavey, & Andre, 1972) and should also be investigated. Even if
KOR is not manipulated within the experiment, detailed reporting of this
information in future research projects will allow subsequent meta-analytic
reviews of this area to extract useful guidelines for obtaining optimal
training outcomes based on this variable.

THROUGHPUT  VARIABLES 

Simulator Design

This review did not attempt a fine-grained analysis using simulator
configuration  and  fidelity  levels.  The  primary  reason  for  this  was  the  lack 
of  detailed  descriptions  of  the  simulator  configuration  parameters  in  use 
during  experimentation.  Results  of  experiments  assessing  the  utility  of 
motion  cuing  for  both  jets  and  helicopters  were  questioned  because  they  lacked 
appropriate  methodological  controls,  such  as  periodic  calibration  checks  of 
the  motion  cuing  (hardware/software)  components. 

These limitations suggest that several areas are in need of further
research. In all cases, close attention must be paid to experimental
methodology to ensure that the results are free from potential competing
explanations concerning the source of the observed experimental effects, and
detailed reporting about the simulators used in research must be encouraged.

Visual  Simulation.  Technological  advances  have  made  experimental  results 
from  early  visual  simulation  virtually  obsolete.  Research  must  be  advanced, 
particularly  in  determining  cue  requirements  for  low-level  flight. 

Motion and Force Simulation. Interest remains high in how motion and
force cuing affect training. Methodological considerations are especially
pertinent for accurately assessing the effects of motion cuing on training
outcomes. Future research in this area should address the issue of
task-specific motion effects. Detailed reporting of results for individual
tasks within a given experiment will provide critical information for
determining what task or sets of tasks benefit from motion/force cues. In
addition, this information may also be used to extrapolate to certain
emergency situations which cannot be trained in the aircraft for safety
reasons.

Training  Context 

Factors within the training domain may provide the highest payoffs for
improving training outcomes. Topic areas that are in need of investigation
include training type and performance measurement.

Training Type. The general finding reported here is that programs
incorporating a proficiency criterion during training are associated with
training outcomes approximately twice as large as those using blocked training
procedures. Given the nature of military training, use of proficiency
criteria during training may not be feasible in all instances. There are
training techniques that may partially substitute for proficiency-based
training. These techniques can be applied within a blocked training program
and may boost training outcomes.

For example, Bailey et al. (1980) reported the use of a backward
chaining procedure to be quite effective when training a 30-degree dive-bomb
maneuver. This procedure involved breaking the maneuver into several steps,
such as final approach, roll-in, base leg, and downwind leg. Training then
proceeded in reverse order through the steps, thus giving the student ample
practice on what was considered the most critical part of the task (i.e., the
final task segment). This procedure made use of an instructional feature
typically found on most full and many part-task simulators (i.e.,
initialization). Appropriate use of this procedure would require the
instructors to learn to use relevant instructional features in order to
implement the backward chaining procedure. In this regard, at least one
report has presented evidence indicating instructional features incorporated
within the simulator are rarely used (Gray, Chun, Warner, & Eubanks, 1981;
see also Tracey, 1984). This suggests that instructional features may need to
be accompanied by an embedded training program that demonstrates the
application of relevant learning principles and procedures for each available
instructional feature.
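
The backward chaining sequence described above can be sketched in a few lines. This is an illustration only: it assumes the segments are flown in the order downwind leg, base leg, roll-in, final approach, and that each practice block continues through to the end of the maneuver. Neither detail is stated in the report summarized here, and the function name is hypothetical.

```python
def backward_chain(segments_in_flight_order):
    """Build practice blocks that start progressively earlier in the maneuver."""
    n = len(segments_in_flight_order)
    return [list(segments_in_flight_order[start:]) for start in range(n - 1, -1, -1)]


# Segment names from the text, arranged in assumed flight order.
dive_bomb = ["downwind leg", "base leg", "roll-in", "final approach"]

for number, block in enumerate(backward_chain(dive_bomb), start=1):
    print(f"block {number}: initialize at '{block[0]}', fly {' -> '.join(block)}")
```

Each block corresponds to one use of the simulator's initialization feature, which is why instructor familiarity with that feature matters for this technique.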

In general, the challenge for researchers and training developers is to
devise training programs for instructors that will enhance training outcomes
for blocked training programs to a level equal to programs using
proficiency-based criteria. A similar challenge was given by Bloom (1984) to
training developers in the psychological and educational domains; that is, to
devise group training programs that will equal training outcomes expected when
training is on a one-on-one basis.

Performance Measurement. A major problem with the research in this area
is the almost complete reliance on subjective instructor/pilot ratings.
Toward the goal of establishing improved subjective performance measures,
inter-rater reliability information must be documented in experiments.
Influential and far-reaching decisions are being made based on the
effectiveness of simulator training, compared to similar training in the
aircraft (see, e.g., Orlansky & String, 1977). Given that the metric of
training effectiveness is regularly based on IP ratings, it is imperative that
these measures be reliable when used in an experiment, or at the very least,
that the unreliability of these measures be factored into the decision
process.

SUMMARY

For this review, issues within the flight simulator training domain were
separated into two major areas: input variables (task equipment, task
requirements, and trainee characteristics) and throughput variables (simulator
design and training context). These areas coincide with components of the
meta-model depicted in Figure 1, which in turn were derived from previous
reviews of the aviation training domain. A primary goal of this review was to
identify variables that moderate the magnitude of simulator training outcomes,
including specific design/fidelity features, from results of
transfer-of-training (TOT) experiments in this area. Experiments were included
if they reported sufficiently detailed information to allow analysis involving
appropriate meta-analytic techniques. A second goal of this review was to
provide an agenda for future research to fill information gaps derived from
results of the meta-analysis. Finally, guidelines describing information that
needs to be reported in future research publications were generated to aid
researchers in this area. These guidelines are presented in Appendix N, and
will help to ensure that results from future experiments can be used in
subsequent meta-analytic reviews.

Lack of detailed reporting of information concerning training methods,
simulator configuration, fidelity levels, and training tasks hampered detailed
analysis in these areas. Insufficient statistical information resulted in the
exclusion of a number of experiments.

The major findings of the meta-analysis are as follows:

Task  Equipment 

The outcomes of the experiments involving the training of jet pilots were
different from those involving the training of helicopter pilots. Results
differed in both size and pattern of training outcomes. Jet experiments
consistently found simulator training combined with aircraft training to be
better than training in the aircraft alone. The findings from similar
helicopter experiments were less consistent, and only slightly favored
simulator training combined with aircraft training over aircraft training
alone.

An insufficient number of helicopter experiments (total N = 7) precluded
any in-depth analysis involving this type of aircraft. Therefore the results
of the meta-analysis are specific to jet aircraft training involving recent
Undergraduate Pilot Training (UPT) graduates or current trainees with little
or no experience in a simulator or in a jet aircraft.

For jets, the overall training effect for all tasks trained was positive
and robust. Over 90 percent of the experimental comparisons favored the
simulator and aircraft trained group over the aircraft-only trained
group. On the average, subjective performance measures (e.g., instructor
ratings) were more sensitive to training effects, and produced larger effects
than those obtained with objective measures (e.g., instrument readings). As
training for both groups progressed and reached the point where it was
conducted solely in the aircraft, differences between the groups diminished.

Certain tasks were more effectively trained in the simulator than others.
For jets, when simulators were used for the training of takeoff, approach (to
landing), and landing (excluding carrier landings) tasks, the training effects
were greater than they were for the combination of all tasks.

Trainee  Characteristics 

Only two trainee characteristics were identified as likely to have an
effect on training results: flight experience and UPT grades. These
differences in trainees were rarely studied. When there was concern that
these differences might affect training in any single experiment, an effort
was made to compose each of the trainee groups with equal amounts of
experience or similar grades.

Simulator Design

For jet training, motion cuing was found to add nothing to the simulator
training effectiveness, and in some cases, may have taken away from the
training value of the simulator. However, this finding may not be truly
representative of the effectiveness of motion-based training since 1) there
was a lack of periodic calibration of the motion cuing systems; and 2) the
results were based on all tasks combined. The positive effects of motion for
any one task may have been masked by the negative effects of motion for
another task.
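
A minimal numerical sketch of this masking effect follows. The task labels and correlations are hypothetical values chosen for illustration, not results from the experiments reviewed, and averaging through Fisher's z transform is one common pooling choice rather than the procedure used in this meta-analysis.

```python
import math

# Hypothetical task-level motion effects (correlations); not values from the report.
task_effects = {
    "task A": 0.30,
    "task B": 0.25,
    "task C": -0.28,
    "task D": -0.27,
}


def pooled_r(effects):
    """Unweighted mean correlation computed through Fisher's z transform."""
    z_values = [math.atanh(r) for r in effects]
    return math.tanh(sum(z_values) / len(z_values))


overall = pooled_r(task_effects.values())
print(f"pooled r across tasks = {overall:.3f}")
for task, r in task_effects.items():
    print(f"{task}: r = {r:+.2f}")
```

In this constructed case the pooled value is close to zero even though every individual task shows a moderate effect, which is why task-level reporting is needed before drawing conclusions about motion cuing.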

Training Context

The average effectiveness for training programs where trainees were
allowed to progress based on a demonstrated proficiency was greater than for
training programs where all trainees proceeded at the same pace. Information
on other aspects of the training context, such as the use of instructional
features and the provision of feedback, was seldom reported and could not,
therefore, be analyzed.

APPENDIX A


Experiments Excluded from the Meta-analysis

AUTHOR (YR)

Bailey, Hughes & Jones (1980)
Biersner (1976)
Billings, Gerke & Wick (1973)
Brictson & Briendenback (1981)
Browning, McDaniel & Scott (1982)
Burger & Brictson (1976)
Caro, Isley & Jolley (1973)
Caro, Isley & Jolley (1975)
Crawford, Hurlock, Padilla & Sassano (1976)
Crosby (1977)
Demaree, Norman & Matheney (1965)
Edwards, Weyer & Smith (1979)
Ellis, Lowes, Matheney & Norman (1968)
Hagin, Duvall & Smith (1979)
Ince, Williges & Roscoe (1975)
Irish & Buckland (1978)
Jacobs & Roscoe (1975)
Jacobs, Williges & Roscoe (1973)
Koonce (1979)
Krahenbuhl, Marett & Reid (1978)
Lintern (1980)
Prather, Berry & Jones (1971)
Povenmire & Roscoe (1971)
Prophet & Boyd (1970)
Reicher, Davidson, Hawkins & Osgood (1980)
Reid & Cyrus (1974)
Reid & Cyrus (1977)
Roscoe & Williges (1975)
Ruocco, Vitale & Benfari (1965)
Ryan, Scott & Browning (1978)
Thorpe, Varnesey, McFadden, Lemaster & Short (1978)
Smith, Pence, Queen & Wulfek (1974)
Woodruff & Smith (1974)
Woodruff, Smith, Fuller & Weyer (1976)
Woodruff, Smith & Harris (1974)
Young, Jensen & Treichel (1973)

REASON FOR EXCLUSION

No Transfer
No Transfer
No Training
Insufficient Statistics
Insufficient Statistics
No Training or Transfer
Insufficient Statistics
Insufficient Statistics
No Transfer
No Transfer
No Training
No Transfer
or Transfer
No Training or Transfer
Insufficient Statistics
No Transfer
No Training or Transfer
Not Appropriate Statistics
No Transfer
Insufficient Statistics
Inappropriate Measures
Fixed Wing Not Jet
No Transfer
Insufficient Statistics
Insufficient Statistics
Insufficient Statistics
Insufficient Statistics
Insufficient Statistics
No Training
No Transfer
Insufficient Statistics
Insufficient Statistics
No Transfer
Insufficient Statistics
Insufficient Statistics
Insufficient Statistics
Insufficient Statistics

APPENDIX B

PRELIMINARY CODE SHEET WITH DESCRIPTION OF AREA TOPICS

EXPERIMENTAL COMPARISON: (Circle one)

_ 1. Simulator Training vs. Aircraft-only Training
_ 2. Motion vs. No Motion
_ 3. Limited vs. Full Amount of Simulator Training
_ 4. Other Comparison (describe)

STUDY ID #: _          CODER ID: _
DEVICE NAME:           AIRCRAFT:

1. REPORTED STATISTIC VALUE(S)
2. CORRESPONDING VALUES: MEANS, SDs, AND Ns
Identifying information: dep. measure, group comparisons, etc.

Study Characteristics
1. Subj. assignment
2. Use of matching
3. Study comparisons
4. Dep. measure(s) used
5. Type of statistics reported

Subject Characteristics
1. Service branch
2. Prior experience

Training Length/Criteria
1. Total number hours/trials
2. Criteria for advancement
3. Rating system
4. Inter-rater reliability

Task(s) Trained
1. Total number tasks
2. Task classification (if applicable)
3. Task names by classification (if applicable)

COMMENTS/PROBLEMS:

1. Design features, such as counterbalancing IP-student pairings for
training and assessment, and IPs blind to group assignment during
evaluation phase.

2. Any irregularities that would affect internal or external validity of
study results.

APPENDIX C

ITEMS DELETED FROM FINAL CODE SHEET DUE TO LACK OF
INFORMATION/VARIABILITY IN AIRCRAFT SIMULATOR
TRAINING EXPERIMENTS BY CODING AREA


Training Characteristics

What was the instructor-student ratio?

What was the instructor's reported level of acceptance for use of
simulator as a training device?

What was the student's reported level of acceptance for use of simulator
as a training device?

To what extent did training incorporate ISD principles and procedures?

Was knowledge of results (KOR) given to student?

If KOR was given, in what form was it given (i.e., system versus
instructor generated, etc.)?

Was there any attempt to transition students from simulator to aircraft?

Were IPs trained to use instructional features of simulator?

Were any part-task training methods employed in conjunction with
simulator training features?

Simulator Fidelity Characteristics

Motion system:

Was a stick shaker or buffet system used?

Sound system:

*a) Was there any sound simulation used?

Cockpit display and flight control characteristics:

*a) Was cockpit (instrumentation/controls) PHYSICALLY similar to the
transfer aircraft?

*b) Do instrumentation/controls FUNCTIONALLY represent those in the
transfer aircraft?

*c) Was joystick "feel" similar to that of aircraft?

b) Were the simulator flight characteristics validated using actual
aircraft data?

Training features:

a) What special environmental features were used?

b) What special training features were used?

* Indicates item excluded due to lack of variability in experiments used
in analyses; all others were excluded due to lack of information (i.e., not
stated in report).

Was a covariate used to reduce error variance in performance measures?

If a covariate was used, was it cognitive or non-cognitive (e.g.,
personality assessment, physical ability, etc.) in nature?

For training, were instructor pilot (IP)-subject pairings
counterbalanced?

For assessment, were IP-subject pairings counterbalanced?

Were IPs blind to type of training given students?

If IP ratings were used to assess student performance, what was the
reported inter-rater reliability estimate?

If IP ratings were used, what was the reported intra-rater reliability
estimate?

* How did the study assess student's prior flight experience
(paper-and-pencil test, UPT grades, background check, etc.)?

To what extent was the training program the same across treatments?

* Indicates item excluded due to lack of variability in experiments used
in analyses; all others were excluded due to lack of information (i.e., not
stated in report).

APPENDIX D

CODE  SHEET  WITH  RESEARCH  FREQUENCIES  FOR  RESPONSE 
CATEGORIES  AND  MEANS  FOR  CONTINUOUS  ITEMS 

An asterisk (*) on an item indicates that an experiment may be coded in
more than one category on that item.

1. Experiment ID #: N/A

2. Coder ID #: N/A

3. Date of publication: N/A

4. Publication Source:

   0 Book                 0 Dissertation
   _ Journal              0 Paper Presentation
   19 Technical Report    0 Unpublished Manuscript
   2 Other (describe) _

SIMULATOR/AIRCRAFT INFORMATION

5. Simulator name/identification code: N/A

6. Aircraft name/identification code: N/A

STUDY DESIGN AND SUBJECT CHARACTERISTICS

7. Subject assignment to groups was:

   5 Random Only              10 Matching Only
   [illegible] Neither        1 Unstated/Unclear

8. Subjects were:

   8 Recent Undergraduate Pilot Training (UPT) Graduates
   0 Experienced Pilots Transitioning to New Aircraft
   1 Mixed - Having Both High And Low Experienced Pilots
   2 Other (describe): UPT TRAINEES (N=6); RECENT AF ACADEMY GRADS; F-14
     TRAINEES ENTERING AIR REFUELING STAGE; FCLF [illegible]

*9. Group contrasts consisted of:

   12 Simulator Training Versus Aircraft Training (Control)
      (SIMULATOR TRAINING vs FLY ONLY CONTROL)
   5 Simulator Training With Motion System On Versus Simulator Training
      With Motion System Off (MOTION vs NO MOTION)
   2 Full Amount Of Simulator Training Versus Limited Amount Of
      Simulator Training (FULL vs LIMITED TRAINING)
   2 Other (describe): OFT + OFT VS OFT ALONE; OLD VS. NEW SIM.

SIMULATOR FIDELITY CHARACTERISTICS

10. Visual system:

10a. 172.4 Horizontal Field-of-view
     (Average - based on 16 studies)

10b. 199.9 Vertical Field-of-view
     (Average - based on 15 studies)

10c. Type visual system used:

   11 Computer Generated Image (CGI)
   4 Television Model Board
   1 Other (describe): UNSPECIFIED
   3 No Visual System

11. Motion system:

11a. Degrees-of-freedom of motion system (count, then category):

   3            0 (fixed base)
   [illegible]  1
   [illegible]  2
   3            3
   0            4
   1            5
   12           6
   0            Not Stated

11b. Was G-seat used?  [illegible] Yes   12 No   4 Not Stated

11c. Was G-suit used?  1 Yes   [illegible] No   14 Not Stated

12. Training features:

   Simulator is considered a:
   15 Whole-task Trainer
   4 Part-task Trainer
   0 Not Stated

13. Amount of simulator training was determined by:

   4 Proficiency-based Criterion
   14 Blocked Design (all subjects received an equal amount of training
      time)
   1 Other (describe): UNSPECIFIED
   0 Not Stated

14. 8.87 Number Training Hours (summed across tasks)
    (Average - based on 16 studies)

15. 92.4 Number Training Trials (summed across tasks)
    (Average - based on 10 studies)

RESEARCH  MEASURES 

16.  Dependent  measures  were: 

0 Exclusively Objective In Nature

9  Exclusively  Subjective  In  Nature 

10  A  Combination  of  Both  Objective  and  Subjective 
Measures 

17. Is information explicitly stated for determining initial transfer?

   5 Yes   14 No

18. Is information explicitly stated for determining final transfer?

   2 Yes   [illegible] No

APPENDIX E
CODE BOOK

An  asterisk  (*)  Indicates  the  experiment  may  be  coded  In  more  than  one 
category  for  that  Item. 

1.  Report  ID  #:  _ 

Write  the  ID  #  In  the  space  provided.  The  report  ID  #  appears  In  the  top 
right  hand  corner  of  the  front  (title)  page  of  all  reports,  Journal  articles 
etc,  Some  reports  will  be  coded  more  than  once,  since  they  may  have  several 
comparison  groups  (e.g.,  Simulator  Training  versus  Aircraft  Training  and 
Simulator  Training  With  Motion  versus  Simulator  Training  Without  Motion) . 

2. Coder ID #: _

Write the coder ID # in the space provided:

Carolyn  Prince  (1) 

Bob  Hays  (2) 

Eduardo  Salas  (3) 

John  Jacobs  (4) 

3.  Date  of  publication:  _ 

Write the date in the space provided. This may be found either on the front
(title) page, especially if it is a journal article, or on the "report
documentation page" usually placed in the first few pages of a technical
report.

4. Publication Source:

_  Book  _  Dissertation 

_  Journal  _  Paper  Presentation 

_  Technical  Report  _  Unpublished  Manuscript 

_  Other  (describe)  _ 

Place a check mark in the appropriate source category.

SIMULATOR/AIRCRAFT INFORMATION

5. Simulator name/identification code: ________________

Write down the simulator's acronym, code, or both (if applicable) and its
classification (OFT, CPT, or WST). This information can usually be found in
at least two locations: the summary of results section of the "report
documentation page" and the method section within the body of the report.

NOTE. Simulators may be identified nominally, by an alphanumeric code, or
both. In general, Air Force simulators are referred to by an acronym made up
from their title, such as the "Advanced Simulator for Pilot Training" or ASPT.
Navy simulators, by design, have an alphanumeric code, such as the 2F87F used
to train P-3 pilots. A notable exception to this is the 2F103, more commonly
referred to as the "Night Carrier Landing Trainer" (NCLT). Almost all
simulators, regardless of service branch, can be classified as either a) an
operational flight trainer (OFT), b) a cockpit procedures trainer (CPT), or c)
a weapons systems trainer (WST).

6. Aircraft name/identification code: _

Many aircraft have both a name and an alphanumeric identification code. An
example is the P-3 "Orion". Write both in the space provided if both are
given, or at the very least write the ID code. This information can usually
be found in the summary section of the "report documentation page" and in the
method section within the body of the report.

RESEARCH  DESIGN  AND  SUBJECT  CHARACTERISTICS 

7. Subject assignment to groups was:

_ Random Only  _ Matching Only

_ Both  _ Neither


Place a check mark in the appropriate subject assignment category.
This information is usually given in the procedures section within the body of
the report.

NOTE. Random selection is not the same as random assignment. If the report
states that subjects were randomly selected, and there is no additional
information about assignment, place a check mark in the "neither" category.

8. Subjects were:

_  Recent  Undergraduate  Pilot  Training  (UPT)  Graduates 

_  Experienced  Pilots  Transitioning  to  New  Aircraft 

_  Mixed  -  Having  Both  High  And  Low  Experienced  Pilots 

_  Other  (describe)  ____________________________ 

Place  a  check  in  the  appropriate  subject  category.  This  information  is 
normally  located  in  the  beginning  of  the  method  section  within  the  body  of  the 
report  although  it  may  also  be  mentioned  in  the  summary  of  the  report  located 
on  the  "report  documentation  page". 

NOTE. Be careful to read any footnotes pertaining to subjects, since they may
contain important information, such as subject loss or the fact that one or
more subjects had additional flight experience.

*9. Group contrasts consisted of:

_ Simulator Training Versus Aircraft-Only Training (Control)

(SIMULATOR TRAINING vs FLY ONLY CONTROL)

_  Simulator  Training  With  Motion  System  On  Versus  Simulator  Training 

With  Motion  System  Off 
(MOTION  vs  NO  MOTION) 

_  Full  Amount  Of  Simulator  Training  Versus  Limited  Amount  Of 

Simulator  Training  (FULL  vs  LIMITED  TRAINING) 

_  OTHER  (describe)  _ 

Place a check mark in one or more appropriate group contrast categories, even
if there appears to be no usable data from the contrast. If there is usable
data, treat the separate contrasts as separate experiments and fill out
another code sheet. Note on the front page of each of the code sheets that
this is the first (or second, or third, etc.) code sheet for this experiment
(next to the study ID #) and circle the check mark involving the group
contrast category (in #9 above) specifying the contrast to which the code
sheet corresponds.

SIMULATOR FIDELITY CHARACTERISTICS

10. Visual system:

10a. _ Horizontal Field-of-view

10b. _ Vertical Field-of-view

Write the exact FOV parameters in the space provided. This information is
usually found in the method section within the body of the report when
describing the simulator. If the study does not explicitly state this
information, but cites one or more secondary sources, write in "N/S" and the
secondary source(s).

10c. Type visual system used:

_ Computer Generated Image (CGI)

_ Television Model Board

_ Other (describe) __________________________

_ No Visual System

11. Motion system:

Degrees-of-freedom Of Motion System:

_ 0 (fixed base)  _ 1  _ 2  _ 3  _ 4  _ 5  _ 6  _ Not Stated

11b. Was G-seat Used?  _ Yes  _ No  _ Not Stated

11c. Was G-suit Used?  _ Yes  _ No  _ Not Stated

Place a check mark next to the appropriate category item for the visual
system, motion DOF, and use of G-seat and G-suit. This information is usually
found in the method section within the body of the report when describing the
simulator. As above, if the report does not explicitly state this
information, but cites one or more secondary sources, write in "N/S" and the
secondary source(s).

NOTE. If the report states that one or more of these systems were
"available", but doesn't state they were used, check the "Not Stated" category
and make a note to contact the author(s) for this information.

12. Training features:

Simulator is considered a: _ Whole-task Trainer

_ Part-task Trainer

_ Not Stated

Place a check mark next to the appropriate simulator classification category.
This information may be found in one of several places: the report summary,
introduction, or method section. If it is not stated in the report, but the
same simulator is classified in another report, use this information, citing
the other report (with page number). If two or more reports give conflicting
classifications, note this also.

Technical Report 89-006

TRAINING CHARACTERISTICS

13. Amount of simulator training was determined by:

_ Proficiency-based Criterion

_ Blocked Design (all subjects received an equal amount of training
time)

_ Other (describe) _

_ Not stated

Place a check mark next to the appropriate training category. This information
is usually found in the procedure section of the body of the report when
describing training procedures for the experimental and control groups. In
some cases, the performance outcome measure is trials-to-proficiency, but
training wasn't stopped once proficient performance was reached. In such
cases, a category other than proficiency-based training should be checked,
reflecting the actual training procedure.

14. _ Number Training Hours (total)

15. _ Number Training Trials (total)

Write number(s) in space provided. Both training hours and trials may not be
given within the report. If one or both are not given, write "N/S". This
information can usually be found in the procedure section within the body of
the report and/or in a table specifying the training syllabus used. If
conflicting information is given by two or more sources within the report,
note both with accompanying page numbers where the information is found.

NOTE.  The  term  trial  should  be  understood  to  mean  a  single  repetition  of  a 
given  task  or  set  of  tasks.  Trials  should  not  be  confused  with  sessions  on 
the  simulator  or  sorties  in  the  aircraft,  since  multiple  trials  may  occur 
within  a  given  session/sortie. 

RESEARCH MEASURES

16. Dependent measures were:

_ Exclusively Objective In Nature

_ Exclusively Subjective In Nature

_ A Combination of Both Objective and Subjective Measures

Place  a  check  mark  next  to  the  appropriate  measure  category.  Dependent 
measures  are  usually  discussed  in  the  methods  section  within  the  body  of  the 
report.  Some  reports  include  additional  information  about  the  dependent 
measure(s)  in  an  appendix.  If  present,  read  this  information  carefully,  since 
it  may  prove  critical  for  determining  the  correct  classification  of  the 
measure(s) . 

NOTE. Trials-to-proficiency (criteria), when proficiency is determined by
subjective IP ratings, should be classified as subjective.

17. Is information explicitly stated for determining initial transfer?

_ Yes  _ No

18.  Is  information  explicitly  stated  for  determining  final  transfer? 

_  Yes  _  No 

Place  a  check  mark  in  the  appropriate  response  category.  Information  about 
initial  and  final  transfer  can  usually  be  found  either  in  the  method  section 
within  the  body  of  the  report,  in  a  table  summarizing  transfer-of- training 
performance,  or  in  an  appendix.  In  some  instances,  multiple  trials  (e.g.,  the 
first  through  the  third)  are  used  to  assess  initial  transfer  and  later  trials 
(e.g.,  the  seventh  and  eighth)  are  used  to  assess  final  transfer.  Note  the 
transfer trial used for both of these measures when the information is
present.
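
The code book above amounts to a record layout with a few decision rules. A minimal Python sketch of one code sheet is given here for illustration only; the class name, field names, and helper functions are hypothetical and do not come from the report. It encodes the item 7 rule that only an explicitly stated assignment procedure counts, and the convention that unreported values are tallied as "Not Stated".

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodeSheet:
    """One coded group contrast (field names are illustrative assumptions)."""
    report_id: str
    coder_id: int
    assignment: str                  # item 7: "random", "matching", "both", "neither"
    contrast: str                    # item 9: e.g. "MOTION vs NO MOTION"
    motion_dof: Optional[int]        # item 11: None stands for "Not Stated"
    training_hours: Optional[float]  # item 14: None stands for "N/S"


def code_assignment(random_assignment_stated: bool, matching_stated: bool) -> str:
    """Item 7: random *selection* alone does not count as random assignment."""
    if random_assignment_stated and matching_stated:
        return "both"
    if random_assignment_stated:
        return "random"
    if matching_stated:
        return "matching"
    return "neither"


def tally(sheets, field):
    """Count how many code sheets fall in each category of one item."""
    return Counter(
        "Not Stated" if getattr(s, field) is None else getattr(s, field)
        for s in sheets
    )
```

A report with two usable contrasts would produce two `CodeSheet` records sharing one `report_id`, which mirrors the instruction under item 9 to fill out a separate code sheet per contrast.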
APPENDIX F

PROCEDURES FOR CALCULATING WEIGHTED MEAN
POINT BISERIAL CORRELATION COEFFICIENTS


Procedures for deriving a single training outcome effect size (ES)
estimate for individual experiments were as follows. Usable study statistics
(i.e., values based on t-tests, F-tests, chi squares, and Mann-Whitney U
tests) were converted to point biserial correlation coefficients using
formulas provided by Glass et al. (1981; 1970). When appropriate study
statistics were omitted, a search for information allowing calculation of
usable statistics was performed. In many such cases, means and standard
deviations were obtained and subsequently used to calculate one or more
t-statistics. If the numbers of subjects (N's) for corresponding
experimental and control groups were disparate, the resulting t-value was
corrected using a formula described by Hunter et al. (1982, p. 99). In other
cases, reported raw data were used to calculate an F-statistic or chi square.
In four experiments, information was reported describing the size of a
treatment effect using only a p value (with associated treatment means). In
order to render this value usable, it was first transformed to a t-value based
on corresponding degrees of freedom and conservative alpha level (using a
one-tailed distribution) and subsequently transformed into a point biserial
correlation coefficient. In all cases, whenever more than one research
statistic was reported in a single experiment, an average point biserial
correlation coefficient was calculated by first weighting individual
research statistics by the number of subjects used to calculate the statistic.
The final, weighted mean correlation coefficient, denoted as RPB, for a given
experiment has an attached weight equal to the mean number of subjects used to
calculate the individual research statistics. This weight is used when
calculating the overall (population) effect size estimate across experiments.

In cases where there existed competing values that could be used to
estimate a training outcome effect size, the most conservative value was
chosen. For example, when converting reported p values to their corresponding
t-value, the one-tailed table value was used since this value is smaller,
thereby providing a more conservative effect size estimate than the two-
tailed value. The exception to this was converting Mann-Whitney U values to
corresponding p-values. Glass et al. (1981, pp. 130-131) noted that a
U-statistic at the .05 probability level corresponds to a t-statistic at the
.03 or .02 level and thereby provides a more conservative effect size estimate
than the standard t-test. Since the U-statistic is more conservative than a
corresponding t-statistic (assuming the same alpha level), use of the
two-tailed value when converting the former to the latter is appropriate, as
opposed to a smaller one-tailed value.

Note 1. Conversion of F- and t-values was done using a FORTRAN-based
program run on a Zenith 248 microcomputer. This "metatran" program was
generously supplied by Dr. Terry Dickinson of the Old Dominion University
Department of Psychology.
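
The conversions described in this appendix can be sketched in Python. This is not the "metatran" program, whose source is not given here; it is a minimal illustration that assumes the standard textbook conversions (a two-group t or a one-degree-of-freedom F or chi square to a point biserial r). The function names and the `favors_treatment` flag are assumptions introduced for the example.

```python
import math


def r_from_t(t, df):
    """Point biserial r from a two-group t statistic; the sign follows t."""
    r = math.sqrt(t * t / (t * t + df))
    return math.copysign(r, t)


def r_from_f(f, df_error, favors_treatment=True):
    """Point biserial r from an F with one numerator degree of freedom."""
    r = math.sqrt(f / (f + df_error))
    return r if favors_treatment else -r


def r_from_chi2(chi2, n, favors_treatment=True):
    """Point biserial r from a chi square with one degree of freedom."""
    r = math.sqrt(chi2 / n)
    return r if favors_treatment else -r


def weighted_mean_r(rs, ns):
    """Average several r values from one experiment, weighting each by its N."""
    return sum(r * n for r, n in zip(rs, ns)) / sum(ns)


def experiment_weight(ns):
    """Weight carried forward for the experiment: the mean N of its statistics."""
    return sum(ns) / len(ns)
```

For example, an experiment reporting two statistics with r = .20 (N = 10) and r = .40 (N = 30) would receive a weighted mean of .35 and carry a weight of 20 into the across-experiment estimate.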
APPENDIX G


SUBGROUP ANALYSIS COMPARING JET AND HELICOPTER EXPERIMENTS


Aircraft       Number of                                          % unexplained
Type           articles    RPB    S²(RPB)    S²(e)     S²(t)      variance

Jets              10       .26    .03119     .01979    .01140        37
Helicopters        3       .02    .00201     .01786      -            0
Total Group       13       .19    .03452     .01920    .01532        44

Note. RPB is weighted mean point biserial correlation coefficient.
S²(RPB) is observed variance. S²(e) is error variance. S²(t) is "true"
variance. The dash (-) should be read as a zero.
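
The columns of this table can be related to one another with a short Python sketch. It assumes the "bare bones" approach of Hunter et al. (1982), in which error variance is estimated as (1 - RPB²)² K / N for K experiments and N total subjects; the report does not print its formulas, so that form and the function names are assumptions. The percentage column is taken to be "true" variance divided by observed variance, which agrees with the tabled rows.

```python
def bare_bones_summary(rs, ns):
    """Return (weighted mean r, observed variance, error variance) for a subgroup."""
    k = len(rs)
    total_n = sum(ns)
    mean_r = sum(n * r for r, n in zip(rs, ns)) / total_n
    observed = sum(n * (r - mean_r) ** 2 for r, n in zip(rs, ns)) / total_n
    # Variance expected from sampling error alone (assumed Hunter et al. form).
    error = (1 - mean_r ** 2) ** 2 * k / total_n
    return mean_r, observed, error


def percent_unexplained(observed, error):
    """Percent of observed variance left after removing error variance."""
    true_var = max(observed - error, 0.0)  # a negative difference is read as zero
    return 100.0 * true_var / observed if observed > 0 else 0.0
```

Applied to the tabled variances, `percent_unexplained(0.03119, 0.01979)` rounds to 37 for jets, and the helicopter row yields 0 because error variance exceeds observed variance.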
APPENDIX H

SUBGROUP ANALYSIS COMPARING THREE TYPES OF JET EXPERIMENTS


Experiment                 Number of                                        % unexplained
Type                       articles    RPB    S²(RPB)   S²(e)     S²(t)     variance

Simulator versus
Aircraft-Only Training        10       .26    .03119    .01979    .01140       37
Motion versus No-Motion        5      -.05    .03673    .03706      -           0
Other                          4       .19    .03559    .01303    .02256       63
Total Group                   19       .15    .04256    .02143    .02113       50

Note. RPB is weighted mean point biserial correlation coefficient.
S²(RPB) is observed variance. S²(e) is error variance. S²(t) is "true"
variance. The dash (-) should be read as a zero.
APPENDIX I

CONFIDENCE INTERVALS AND RELATED VALUES FOR RPB
TABLE VALUES BY AIRCRAFT AND EXPERIMENT TYPE


Result Classification
Category by Aircraft and          RPB              95% CI Values
Experiment Type                   Value    Sd      (- to +)


JETS

Simulator versus Aircraft-Only Training

(1) Overall effect size            .26    .014     .23 to .29
(2) Objective measures only        .12    .061     .00 to .24
(3) Subjective measures only       .25    .015     .22 to .28
(4) Initial transfer trial         .19    .032     .13 to .25
(5) Final transfer trial           .03    .224    -.42 to .48

Motion versus No Motion

(1) Overall effect size           -.05    .057    -.16 to .06

HELICOPTERS

Simulator versus Aircraft-Only Training

(1) Overall effect size            .02    .019    -.02 to .06


Note. RPB value is mean weighted point biserial correlation coefficient.
Sd is the associated standard deviation for a given RPB value. CI means
"confidence interval".

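The intervals in this table can be approximated with a short Python sketch. It assumes the usual normal approximation, RPB plus or minus a multiplier times Sd; the report does not state the multiplier it used, so the default of 1.96 and the function name are assumptions. With that default most tabled limits are reproduced to two decimals, and small differences (for example in the final transfer row) may reflect rounding of the tabled inputs or a slightly different multiplier.

```python
def confidence_interval(rpb, sd, z=1.96):
    """Return (lower, upper) limits of an approximate 95% interval around RPB."""
    half_width = z * sd
    return rpb - half_width, rpb + half_width


# Overall effect size for jets, simulator versus aircraft-only training.
lower, upper = confidence_interval(0.26, 0.014)
print(f"{lower:.2f} to {upper:.2f}")
```

An interval that includes zero, as in the motion versus no motion row, indicates that the mean effect cannot be distinguished from no effect at this level.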



APPENDIX  J 

SUBGROUP  ANALYSIS  OF  RESEARCH  CHARACTERISTICS  FOR  JET  EXPERIMENTS 


Moderator                Number of                                        % unexplained
(by subset)              articles    RPB    S²(RPB)   S²(e)     S²(t)     variance


Simulator Fidelity Characteristics

Did the visual system incorporate computer generated imaging (CGI) technology?

  Yes (a)                    5       .21    .02768    .01720    .01049       38
  No                         2       .50    .01139    .01785      -           0

What type trainer was simulator?

  Part-task (b)              2       .12    .00010    .01354      -           0
  Whole-task                 8       .35    .02890    .02355    .00535       19


Training Characteristics

What type training system was employed?

  Blocked                    6       .21    .02076    .01972    .00103        5
  Proficiency                3       .54    .00623    .02461      -           0


Research Measures

Did the dependent measures employed in the experiment include both objective
and subjective measures?

  Yes                        3       .12    .00010    .01631      -           0
  No                         7       .39    .02376    .02285    .00091        4

Total Group                 10       .26    .03119    .01979    .01140       37

(a) Item eliminated since it failed to demonstrate a reduction in true
variance relative to that of the total group.

(b) Variables incorporated into only two (2) experiments were reported solely
for the purpose of visual inspection and should not be considered a valid
moderator based on this analysis.

Note. RPB is weighted mean point biserial correlation coefficient.
S²(RPB) is observed variance. S²(e) is error variance. S²(t) is "true"
variance. A dash (-) should be read as a zero (0).
APPENDIX K

INTERCORRELATIONS AMONG RESEARCH CHARACTERISTICS
FOR JET EXPERIMENTS


RESEARCH
CHARACTERISTIC

(1)  (2)  (3)  (4)

1) Use of matching
prior to subject
assignment

1.00 

2) Use of CGI
visual
system

-.940 

pe.lOl 

1.00 

3) Total FOV of
visual system (C)

-.312 

(•). 

pe.303 

.9I9 

(3) 

pe.oas 

1.00 

4) DOF of motion
system (C)

.110 

(10) 

px.lUl 

-.399 

(7) 

p-.3ai 

.199  1.00 

(«) 

P-.174 

5) Use of G-seat

.303 

<«). 

pe.241 

N/A 

.191 
M/A  (1) 

P-.340 

6) Use of
whole-task
simulator

.310 

(10) 

P-.3V3 

-.399 

pe.3«l 

.349  .9344 

(•)  (10) 

p«.103  pe.OOO 

7) Use of
proficiency-
based training

.900 

(«). 

p-.oa9 

-.100 

(7) 

P“.3*17 

-.111  .130 

(•)  («). 
p>.l77  pa. 301 

8) Use of objective and
subjective dependent measures

-.934 

(10) 

p-.OdO 

.400 

(7) 

p-.ia7 

.149  -.443* 

(9)  (10) 

P>.313  P-.019 

9) Number training
hours (C)

.349 

(») 

p>. 104 

.000 

(*> 

p-.4y4 

.344  .474 

(*)  (•) 
po.iao  p-.oui 

10) Number training
trials (C)

.390 

(») 

pa. 314 

.499 

(9) 

pe.a30 

.941*  .114 

(9)  (*) 

pa. 004  pa. 309 

(5)  (6)  (7)  (8)  (9)  (10)


Notes: Number in parentheses is number of
experiments used when calculating Pearson
correlation coefficient. N/A indicates
correlation coefficient cannot be calculated.
* indicates correlation coefficient has ...
(C) indicates variable
is continuous.


1.00 

.914 

(•». 
pa.  303 

1.00 

.444 

(7) 

pa.  049 

.174 

(0) 

pe.lOl 

1.00 

-.391 

(«> 

P-.341 

-.744* 

(10) 

P-.009 

-.900 

(») 

p-.oa9 

1.00 

-.191 

(7) 

p-.lVl 

.412 

(V> 

P-. 139 

.911 

(4) 

p-.oa7 

-.447 

(*) 

P-.  114 

1.00 

N/A 

.114 

(b) 

pe.309 

N/A 

.001 

(0). 

P-.499 

.341 

P-.309 
APPENDIX L


SUBGROUP ANALYSIS COMPARING THREE TYPES OF JET TASKS


Task           Number of                                          % unexplained
Type           articles    RPB    S²(RPB)    S²(e)     S²(t)      variance

Takeoff            3       .65    .02849     .01438    .01410        50
Approach           3       .64    .00646     .01273      -            0
Landing            3       .57    .03081     .01695    .01385        45
Total Group       10       .26    .03119     .01979    .01140        37

Note. RPB is weighted mean point biserial correlation coefficient.
S²(RPB) is observed variance. S²(e) is error variance. S²(t) is "true"
variance.
APPENDIX M

TASK  DIFFICULTY  SURVEY  FOR  JETS  TASKS 


BACKGROUND  INFORMATION 

RANK: _ 

#  YEARS  FLYING : 


EVER  BEEN  AN  INSTRUCTOR  PILOT?  YES  NO 
(circle  one) 


TYPES  OF  AIRCRAFT  FLOWN: _ 

(list  in  order  of  hours  of  experience  -  most  to  least) 

TYPES  OF  SIMULATORS  TRAINED  ON: (list) _ 

DIRECTIONS - On the following pages are listed several maneuvers/tasks often
trained using a simulator. Use the 1-3 scale below to rate each task in terms
of how difficult the task is to learn. If you first learned the task on a
simulator, rate how difficult the task was to learn while training on the
simulator (as opposed to aircraft). Then circle the item number corresponding
to the simulator trained task (see the example on the top of the next page).

Place a rating number on the line next to each task. Place a ZERO (0) next to
any task that you are unsure of or haven't performed.

1               2               3

LOW             MEDIUM          HIGH
DIFFICULTY      DIFFICULTY      DIFFICULTY

1. A LOW DIFFICULTY task is one in which:

-actions are clearly defined
-all information is available
-components of task can be learned
in a short period of time

2. A MEDIUM DIFFICULTY task is one in which:

-it is not always clear what your actions should be
-needed information may not always be available
-performance of the task is often a series of
actions that are moderately complex
-there is some stress involved

3. A HIGH DIFFICULTY task is one in which:

-there are a number of things to do
-needed information may not be present
-adaptation is required
-stress is moderate to high
-actions that make up task are moderately to
very complex

EXAMPLE: (helicopter tasks)

2  1. Takeoff to Hover (simulator trained task)

1  2. Landing from Hover

3  3. Confined Area Approach (simulator trained task)


General Description: Air-to-air combat maneuvers

Maneuver/task description

_ 1. Acceleration Maneuver
_ 2. High Yo-Yo
_ 3. Quarter Plane
_ 4. Barrel Roll Attack
_ 5. Immelmann Attack
_ 6. Lag Roll
_ 7. Separation
_ 8. Tactical Formation
_ 9. Set up on Perch
_ 10. Defensive Maneuvers
_ 11. Low Yo-Yo
_ 12. Lag Pursuit
_ 13. Rolling Scissors
_ 14. Guns Defense (High-G Barrel Rolls)
_ 15. Head On Maneuvering
_ 16. Atoll Extension

(Related Skills)

_ A. Descriptive Commentary
_ B. Range Estimation
_ C. Target Acquisition
_ D. Kept Bogey in Sight
_ E. Weapons Parameter Recognition
_ F. Switchology


General Description: Four engine jets only (if you have never flown this type
aircraft, please skip this section)

Maneuver/task description

_ 17. Abort Four Engines
_ 18. Abort Three Engines
_ 19. Engine Failure After Refusal
_ 20. Departure
_ 21. Holding
_ 22. TACAN/VOR
_ 23. LOC
_ 24. GCA
_ 25. ILS
_ 26. Normal Landings
_ 27. Approach Flap Landings
_ 28. Waveoff
_ 29. Three Engine Landings

General Description: Airwork maneuvers

Maneuver/task description

_ 30. One Engine Failure at TO
_ 31. Two Engine Failure at TO
_ 32. Low Altitude Restart
_ 33. One Engine Approach
_ 34. Hydraulic Failure
_ 35. Slow Flight
_ 36. Takeoff
_ 37. Straight in Touch-and-go
_ 38. Go Round
_ 39. TO Climb
_ 40. Landing
_ 41. Traffic Pattern Stall
_ 42. Control Response
_ 43. Trim
_ 44. Straight-and-level
_ 45. Pitch, Bank, and Power
_ Straight-and-level
_ 48. CAS Descent
_ 49. CAS Climb Turn
_ 50. Level Offs
_ 51. Level Turns
_ 52. Change of Airspeed
_ 53. Traffic Pattern Steep Turns
_ 54. 30 deg. Bank Turns
_ 55. 45 deg. Bank Turns
_ 56. 60 deg. Bank Turns
_ 57. Turn-to-headings (TH)
_ 58. Airspeed Changes While TH
_ 59. Tech Order Climbs
_ 60. Configuration Change
_ 61. Descending Left Turns
_ 62. Traffic Exits
_ 63. Straight in Approach Landings
_ 64. 360 deg. Traffic Pattern
_ 65. Power-on Stalls
_ 66. Constant Air Speed (CAS) Descending Turn
_ 67. Vertical-S-Delta
_ 68. Carrier Qualification (CQ) Landings
_ 69. Night CQ Landings
_ 70. Field Carrier Landing Practice (FCLP)
_ 71. Night FCLP
_ 72. Bomb Delivery Approach
_ 73. Bomb Delivery Release
_ 74. Air-to-air Refueling

General Description: Formation Flying

Maneuver/task description

_ 75. Fingertip
_ 76. Crossunder
_ 77. Turning Rejoin
_ 78. Wingwork (Fingertip at 15-30 deg. bank)
_ 79. Procedures (start up & shut down)
_ 80. Aborted TO


General Description: Aerobatics

Maneuver/task description

_ 81. Aileron Roll
_ 82. Split S
_ 83. Loop
_ 84. Lazy 8
_ 85. Immelmann
_ 86. Cuban 8
_ 87. Cloverleaf


DIRECTIONS: Now go back and, for tasks having prior simulator training
(circled items), place a plus (+) next to the rating if actual performance in
aircraft was noticeably harder than in simulator. Place a dash (-) if
performance in aircraft was noticeably easier than in simulator (see example
below).

EXAMPLE:

-2  1. Takeoff to Hover (simulator trained task that is easier in aircraft)

 1  2. Landing from Hover

+3  3. Confined Area Approach (simulator trained task that is harder in
aircraft)

APPENDIX N

GUIDELINES  FOR  REPORTING  FUTURE  EXPERIMENTAL  RESULTS 


Introduction 

The problems associated with conducting transfer-of-training (TOT)
research in the aviation training domain are well documented. It is fully
expected that future research in this area will be plagued by similar problems
and that experimental rigor will suffer accordingly. These guidelines are
meant as an aid for those attempting TOT experiments so that sufficient
information will be available for subsequent meta-analytic review. While
meta-analysis has its own set of problems, it offers a unique perspective for
answering many long-standing questions about the nature of simulator training.

In general, consider that any information that is not explicitly stated
within the report cannot be assumed by the reviewer. For example, several
reports included in this review noted that subjects were randomly selected
from a class of student aviators. However, no mention was made concerning
student assignment to the experimental conditions. The following items are
sources of information that should be addressed when reporting results of
experimentation in this area.

1) Research design

a) Matching - State whether subjects were matched prior to assignment to
the experimental conditions. Describe the variable(s) used for
matching and the outcome of the matching procedure.

b) Subject assignment - State the procedure for assigning subjects to the
experimental conditions. If random assignment wasn't possible or was
compromised in any way, report information about how it affected the
various groups.

c) Loss of subjects - Attrition may occur for a variety of reasons, and
information concerning subject loss must be described in detail,
including procedures used when performing statistical analyses (e.g.,
adjusting degrees-of-freedom).

d) Bias reduction procedures - Counterbalancing and having raters blind
to subject's experimental assignment are common procedures used to
reduce potential measurement bias. Their use (or non-use) should be
chronicled.

e) Estimation of rater agreement - Inter- and (if possible) intra-rater
agreement should be assessed and reported. If objective measures are
used in conjunction with subjective indices, report the relation
(e.g., r) between these measurement types.
2) Performance measures and statistical analysis

a) Performance measures - A detailed description of each performance
measure should be given. If an established measure is used, cite
relevant literature describing each measure and give pertinent
information concerning its application within the experiment.

b) General statistical reporting requirements - Reporting detailed
information about all statistical analyses is a must. In general,
means, associated standard deviations, and number of subjects must be
reported for separate analyses, even when multivariate procedures are
used.

c) Commonly used statistical procedures - Common statistical procedures
used to describe the magnitude of between-group differences include t-
and F-tests. The value of the statistic should be reported as well
as the associated p-value. An indication of the direction of the
statistic, relative to the stated null hypothesis, should also be
stated, and a priori versus a posteriori analyses should be delineated.
The same basic requirements apply when reporting analyses based on
non-parametric procedures (e.g., chi square). Reporting exact
p-values is imperative when reporting results of non-parametric
analyses.

d) Areas of specific interest - Currently, issues surrounding task type
and the extent of TOT performance are in need of investigation.
Accordingly, information about individual tasks should be reported
separately, as should that for individual transfer trials (if
appropriate).

e) Covariates - Use of any cognitive or non-cognitive variables as
covariates should be reported. The relationship (r) between the
covariate and associated criterion performance variable should be
given, and if appropriate, the interclass correlations between the
various criterion measures.

3) Training characteristics

a) General training features - Describe the extent to which the various
training programs used in the experiment were alike and how they
differed in terms of relevant training parameters (e.g., time to
complete training, number of training trials).

b) Instructor variables - Report the instructor-student ratio as well as
the use of specific simulator instructional features by the
instructor. Describe any training given to instructors on the use of
instructional features. Describe the instructor's level of acceptance
of the simulator as a training device (this information may be
task-specific). Describe the extent to which student motivation
influences instructor ratings.

c) Student variables - Report the student's level of acceptance of the
simulator as a training device. Describe any attempts at
transitioning the student from simulator to aircraft.

d) Training program - If applicable, describe the development of the
training program(s) with regard to ISD principles and procedures or at
least cite references providing this information. Describe how
knowledge-of-results (KOR) was given to the student (e.g., how often,
in what form). Describe any part-task training methods employed.

4) Simulator fidelity characteristics

a) General - Describe the level of physical and functional fidelity for
each of the simulator's subsystems (i.e., sound, motion/force, visual,
cockpit display, and flight control characteristics). The key here is
reporting use, and not just availability, of specific simulator
components (e.g., g-seat, g-suit, stick shaker system).

b) Specific areas of interest - Describe how the simulator flight
control characteristics were validated (example: seat-of-pants vs.
data from actual aircraft). If use of motion/force cuing is a primary
experimental manipulation, provide information concerning calibration
of hardware/software parameters (at the very least, report results of
calibration tests prior to and after experimentation).